applicability; citations and explanations supporting such statement

Novelty (N) Yes: Claims 1-27,33-45

No: Claims 28-32

Inventive step (IS) Yes: Claims 4-16,22,24,33-45

No: Claims 1-3,17-21,23,25-32

Industrial applicability (IA) Yes: Claims 1-45

No: Claims

Section I



1. Basis of the opinion

a. Originally filed documents also include pages 1-57 of sequence listing.

b. Sequence listing pages 1-57, filed with the letter of 19.10.99, do not form part of
the application (Rule 13ter.1(f) PCT).



Section V 

2. The applicant's observations submitted with the amended claims have been 
considered in establishing this report. 



3. Reference is made to the following documents:

D1: Garbe and Stringer, Infect. Immun., Vol. 62, pp. 3092-3101 (1994);

D2: Chary-Reddy and Graves, J. Clin. Microbiol., Vol. 34, pp. 1660-1665 (1996);

D3: Kovacs et al., J. Biol. Chem., Vol. 268, pp. 6034-6040 (1993).



4. Novelty (Article 33(2) PCT) 

Claim 28 is directed to a nucleic acid molecule comprising a sequence selected 
from the group consisting of the given portions of SEQ ID NOs 1, 3, 5, 7, 9, 11, 13, 15
and sequences with at least 70% sequence identity with the said portions. The 
sequence shown in Fig.5a of D1 comprises a sequence highly homologous with 
those of the claim, for instance, nucleotides 2987-3232 show >96% homology with 
residues 2839-3084 of SEQ ID NO 7. Therefore the subject-matter of claim 28 is 
not novel over D1. Similarly, the nucleic acid molecules of claims 29 and 30 are
comprised in the msgl sequence of D1, which thereby renders said claims not 
novel; moreover, in view of the cloning methods used to gain said sequences (D1: 
Materials and Methods), the recombinant vector (claim 31) and cell containing 
said vector (claim 32) are also inevitably disclosed by D1. 
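
As an illustration of how a figure such as the ">96% homology" above is commonly obtained, the following minimal sketch computes percent identity over two equal-length, already-aligned windows. The function names and the toy sequences are hypothetical; the actual D1 and SEQ ID NO 7 sequences are not reproduced here, and the report does not state which alignment method the examiner used.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of positions at which two pre-aligned, equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def window_length(first: int, last: int) -> int:
    """Number of residues in an inclusive, 1-based range such as 2839-3084."""
    return last - first + 1


# Both windows compared in the report span the same number of nucleotides.
assert window_length(2987, 3232) == window_length(2839, 3084) == 246

# Toy 20-mers differing at one position (hypothetical data, not taken from D1).
toy_d1 = "ATGGCGCGTGCAGTTAAGCG"
toy_claim = "ATGGCGCGTGCAGTTAAGCA"
print(percent_identity(toy_d1, toy_claim))  # 95.0
```

Assuming a gap-free alignment, identity above 96% over a 246-nucleotide window leaves room for at most 9 mismatched positions (237/246 is roughly 96.3%), which is well above the 70% threshold recited in claim 28.
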

5. Inventive step (Article 33(3) PCT)

a. Methods of detecting pathogenic microorganisms based on identification of
specific DNA sequences, using PCR amplification and/or oligonucleotide probe
hybridization, are common in the art. For instance, D2 describes the identification
of P. carinii from rat tissues by PCR amplification of a portion of the rat P. carinii
msg gene (p. 1660, Introduction, paragraph 2). The targeting of sequences which
are conserved in, but unique to, the pathogenic organisms in any one disease is
an important aspect of such an analysis, and indeed the primers in D2 were
chosen according to specific homology with rat msg genes (p. 1663, col. 1,
paragraph 1).

b. The MSG protein of P. carinii is encoded by multiple related genes producing a
family of closely related proteins, as disclosed in D3 for rat P. carinii. However, as
pointed out in the present application (p. 2, l. 20-26), more variation occurs
between msg genes isolated from different host-specific strains, so that 
sequences suitable for detection in the rat are not necessarily applicable to 
humans. 

c. Claims 1 and 23 are directed to methods of detecting Pneumocystis carinii, in
which a conserved region within human P. carinii is amplified using primers from
the human P. carinii MSG protein encoding sequence (claim 1), or is hybridized
with a probe for a conserved region within the human P. carinii MSG protein
encoding sequence (claim 23). However, these claims simply define the standard approach
to identifying pathogenic microorganisms, applied here to a specific case, without
defining the essential feature required to carry it out, i.e. the conserved sequence.

d. Moreover, D1 discloses a complete human P. carinii msg sequence, as well as 
several partial sequences: Fig.6 shows alignment of amino acid sequence data 
from four msg clones. Several portions clearly show a high degree of homology, 
which would be expected to extend also to the nucleic acid sequences. 

e. Thus, the skilled person seeking to detect human P. carinii would use standard 
methods and the sequence data provided in D1, concentrating on possible 
homologous regions in order to increase his chances of success. As such, the 
subject-matter of claims 1 and 23 does not appear to be inventive.

Dependent claims 2, 3 and 17-21 do not appear to contain any additional features
which, in combination with the features of the claims to which they refer, would
render them inventive in the sense of Article 33(3) PCT, as said features are
considered standard in the art.

f. Claims 25-27 are directed to the human MSG proteins 1, 3, 11, 14, 32, 33 and 35,
and their corresponding nucleic acid sequences. Although these sequences have 
not previously been disclosed and are therefore novel, they are not considered to 
be inventive. The existence of a number of variants would be expected in the light 
of D3, and the availability of complete human MSG gene and protein sequences 
from D1 means that it would be routine practice for the skilled person to isolate 
such variants; this indeed is what appears to have been done in the present 
application. Therefore, claims 25-27 are not considered to be inventive. 

6a. Claims 4-16, 22 and 24 each provide preferred embodiments of the claimed
methods, using specific sequences, all of which are directed to a particular 
conserved region of the msg genes. Although the teaching of D1 might enable the 
skilled person to try certain portions of the genes based on the incomplete 
comparisons in Fig.6, he would not specifically be directed to the conserved 
region in question. Thus said claims appear to be novel and inventive. 

b. Similarly the kits (claims 33-43) comprising primers taken from the specified
conserved region are not anticipated by any prior art document, taken alone or in
combination, and therefore seem to be new and inventive.

c. The antibodies defined in claims 44 and 45 are raised against two specific 
sequences, providing different effects: one is unique to and thus specific for 
HMSG32 (claim 44); the other is for a conserved MSG epitope (claim 45). Both 
are novel and inventive, as the antigenic peptides defined are not indicated in the 
prior art. 

7. The document Mei et al., Infect. Immun., Vol. 66, pp. 4268-4273 (Sept. 1998), was
cited as a P,X-document in the International Search Report. However, the priority
date of 17.08.98 for the present application is considered to be valid, so that the
cited document does not count as prior art under Rule 64.1 PCT for the purposes
of Article 33 PCT.
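
The date comparison behind point 7 can be sketched as follows. This is a simplified reading of Rule 64.1 PCT, under which a written disclosure counts only if it was made available before the relevant date (here the priority date, since the priority claim is considered valid). The function name is hypothetical, and because the report gives only "Sept. 1998" for the Mei et al. paper, the first day of that month is assumed as the earliest possible publication date.

```python
from datetime import date


def counts_as_prior_art(made_available: date, relevant_date: date) -> bool:
    """Simplified Rule 64.1 test: disclosure must predate the relevant date."""
    return made_available < relevant_date


priority_date = date(1998, 8, 17)            # 17.08.98, as stated in the report
mei_earliest_publication = date(1998, 9, 1)  # assumption: earliest day of "Sept. 1998"

# Even on the earliest assumed day, the paper postdates the priority date.
print(counts_as_prior_art(mei_earliest_publication, priority_date))  # False
```
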

Section VII 

8. Contrary to the requirements of Rule 5.1(a)(ii) PCT, the relevant background art
disclosed in the document D2 is not mentioned in the description, nor is this 
document identified therein. 

Section VIII 

9. The following objections are under Article 6 PCT: 

a. The phrase "and conservative substitutions thereof", used in claim 25, is vague 
and leaves the reader in doubt as to the exact nature of the subject-matter being 
claimed, thereby rendering the definition of the subject-matter of said claims 
unclear. Although conservative substitutions are discussed in the description
(p. 13-14), it is unclear in what way such substitutions may be limited. In particular,
this same passage refers to sequences of at least 63% homology (which would 
include the MSG disclosed in D1), and thereby implies that the subject-matter for 
which protection is sought may be different to that defined by claims 25-27, i.e. not 
restricted to the sequences given. Therefore, a lack of clarity in the claims arises 
when using the description to interpret them. 

b. Claims 4-7, 28 and 33-35 refer to residues 2887-3132 of HMSG33 (SEQ ID
NO: 11), although said SEQ ID NO: 11 extends only as far as residue 3054. Said
claims are therefore unclear.

c. The vague and imprecise statement in the description, p. 30, l. 9-13, referring to the
"spirit" of the invention, implies that the subject-matter for which protection is
sought may be different to that defined by the claims, thereby resulting in lack of
clarity (Article 6 PCT) when used to interpret them (see also PCT Guidelines,
C-III, 4.3a).
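
The inconsistency raised in objection 9b is a simple bounds check, which might be sketched as below. The helper names are hypothetical; the only figures used are those given in the report (claimed residues 2887-3132, sequence length 3054), and residue numbering is assumed to be 1-based and inclusive.

```python
def range_within_sequence(first: int, last: int, seq_length: int) -> bool:
    """True if the inclusive, 1-based range first..last lies inside the sequence."""
    return 1 <= first <= last <= seq_length


def overshoot(last: int, seq_length: int) -> int:
    """Number of claimed residues lying beyond the end of the sequence."""
    return max(0, last - seq_length)


# HMSG33 (SEQ ID NO: 11): claimed residues 2887-3132, listed length 3054.
print(range_within_sequence(2887, 3132, 3054))  # False
print(overshoot(3132, 3054))                    # 78
```

On these figures the claimed range runs 78 residues past the end of the listed sequence, which is why the claims reciting it are considered unclear.
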
CLAIMS

We claim:

1. A method of detecting the presence of Pneumocystis carinii in a biological
specimen, comprising:

amplifying a highly conserved region within a human-P. carinii nucleic acid
sequence, if such sequence is present in the sample, using two or more oligonucleotide primers
derived from human-P. carinii MSG protein encoding sequence; and

determining whether an amplified sequence is present.

2. The method according to claim 1, wherein amplification of the human-P. carinii
nucleic acid sequence is by polymerase chain reaction.

3. The method of claim 1, wherein the human-P. carinii nucleic acid sequence is a 
highly conserved region within an MSG-protein encoding sequence. 

4. The method of claim 3, wherein the highly conserved region comprises a sequence
selected from the group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006
of HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14
(SEQ ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO:
11), 2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15).

5. The method of claim 1, wherein at least one oligonucleotide primer comprises at
least 15 contiguous nucleotides from a sequence chosen from the group consisting of residues 2894-
3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11
(SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9),
2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of
HMSGp2 (SEQ ID NO: 15) and nucleic acid sequences having at least 70% sequence homology with
residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 2845-
3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of HMSG32
(SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ ID NO:
13) and 1-249 of HMSGp2 (SEQ ID NO: 15).

6. The method of claim 5, wherein at least one oligonucleotide primer comprises at
least 15 contiguous nucleotides from a nucleic acid sequence having at least 90% sequence homology
with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3),
2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of
HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ
ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15).

7. The method of claim 5, wherein at least one oligonucleotide primer comprises at
least 15 contiguous nucleotides from a nucleic acid sequence having at least 95% sequence homology
with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3),
2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of
HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ
ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15).

8. The method of claim 5, wherein the oligonucleotide primers are chosen from the
group consisting of: SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID
NO: 23, and SEQ ID NO: 24.

9. The method of claim 5, wherein the pair of oligonucleotide primers consist of one
upstream primer and one downstream primer.

10. The method of claim 9, wherein:

the upstream primer is chosen from the group consisting of: SEQ ID NO:
17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 23; and

the downstream primer is chosen from the group consisting of: SEQ ID
NO: 20 and SEQ ID NO: 24.

11. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ
ID NO: 17.

12. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ
ID NO: 18.

13. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ
ID NO: 19.

14. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ
ID NO: 20.

15. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ
ID NO: 23.

16. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ
ID NO: 24.

17. The method of claim 1, wherein the biological specimen is from the oropharyngeal
tract.

18. The method of claim 1, wherein the biological specimen is from blood.

19. The method of claim 1, wherein the step of determining whether an amplified
sequence is present comprises one or more of:

(a) electrophoresis and staining of the amplified sequence; or

(b) hybridization to a labeled probe of the amplified sequence.

20. The method of claim 19, wherein the amplified sequence is detected by
hybridization to a labeled probe.

21. The method of claim 22, wherein the probe comprises a detectable nonisotopic
label chosen from the group consisting of:

a fluorescent molecule;
a chemiluminescent molecule;
an enzyme;
a co-factor;
an enzyme substrate; and
a hapten.

22. The method of claim 21, wherein the labeled probe comprises a nucleic acid
sequence according to SEQ ID NO: 19.

23. A method of detecting the presence of Pneumocystis carinii in a biological
specimen, comprising:

exposing the biological specimen to a probe that hybridizes to a highly conserved
region within a human-P. carinii nucleic acid sequence, if the sequence is present in the sample to
form a hybridization complex; and

determining whether the hybridization complex is present
wherein the nucleic acid sequence derived from human-P. carinii is an MSG encoding
sequence.

24. The method of claim 23, wherein the labeled probe comprises a nucleic acid
sequence according to SEQ ID NO: 19.

25. A purified protein comprising an amino acid sequence selected from the group
consisting of

(a) SEQ ID NO: 2;

(b) SEQ ID NO: 4;

(c) SEQ ID NO: 6;

(d) SEQ ID NO: 8;

(e) SEQ ID NO: 10;

(f) SEQ ID NO: 12;

(g) SEQ ID NO: 14;

and conservative substitutions thereof.

26. An isolated nucleic acid molecule encoding a protein according to claim 25.

27. The isolated nucleic acid molecule according to claim 26, wherein the nucleic acid
molecule has a sequence selected from the group consisting of: SEQ ID NO: 1; SEQ ID NO: 2; SEQ
ID NO: 3; SEQ ID NO: 4, SEQ ID NO: 5; SEQ ID NO: 6, SEQ ID NO: 7; SEQ ID NO: 15; and SEQ
ID NO: 17.

28. An isolated nucleic acid molecule comprising a sequence selected from the group
consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID
NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-
3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11),
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15).

29. An isolated nucleic acid molecule comprising a sequence selected from the group 
consisting of: at least 15 contiguous nucleotides of the nucleic acid molecule according to claim 28. 

30. An isolated nucleic acid molecule comprising a sequence selected from the group 
consisting of: at least 20 contiguous nucleotides of the nucleic acid molecule according to claim 29. 

31. A recombinant vector comprising the nucleic acid molecule according to claim 28. 

32. A transgenic cell comprising the vector according to claim 31.

33. A kit for detecting a human-P. carinii nucleic acid sequence comprising at least a
pair of primers each comprising at least 15 contiguous nucleotides of sequence selected from the
group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ
ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-
3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11),
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15).

34. A kit for detecting a human-P. carinii nucleic acid sequence comprising at least a
pair of primers each comprising at least 20 contiguous nucleotides of sequence selected from the
group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ
ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-
3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11),
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15).

35. A kit for detecting a human-P. carinii nucleic acid sequence comprising at least a
pair of primers each comprising at least 30 contiguous nucleotides of sequence selected from the
group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ
ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-
3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11),
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15).

36. The kit of claim 33, wherein at least one of the oligonucleotide primers comprises a
sequence selected from the group consisting of: SEQ ID NO: 17; SEQ ID NO: 18; SEQ ID NO: 19;
SEQ ID NO: 20; SEQ ID NO: 21; SEQ ID NO: 22; SEQ ID NO: 23; and SEQ ID NO: 24.

37. The kit of claim 36, wherein one of the oligonucleotide primers comprises the
sequence according to SEQ ID NO: 17.

38. The kit of claim 36, wherein one of the oligonucleotide primers comprises the
sequence according to SEQ ID NO: 18.

39. The kit of claim 36, wherein one of the oligonucleotide primers comprises the
sequence according to SEQ ID NO: 19.

40. The kit of claim 36, wherein one of the oligonucleotide primers comprises the
sequence according to SEQ ID NO: 21.

41. The kit of claim 36, wherein one of the oligonucleotide primers comprises the
sequence according to SEQ ID NO: 22.

42. The kit of claim 36, wherein one of the oligonucleotide primers comprises the
sequence according to SEQ ID NO: 23.

43. The kit of claim 36, wherein one of the oligonucleotide primers comprises the
sequence according to SEQ ID NO: 24.

44. Antibody raised against the peptide sequence according to SEQ ID NO: 25. 

45. Antibody raised against the peptide sequence according to SEQ ID NO: 26. 
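
The "at least 15 contiguous nucleotides" limitation that recurs in the primer claims (for example claims 5 and 33) can be read as a shared-substring test. The sketch below is one illustrative reading only: the function name and the toy sequences are hypothetical, the comparison is same-strand (no reverse complement), and no sequence from the listing is reproduced.

```python
def shares_contiguous_run(primer: str, region: str, k: int = 15) -> bool:
    """True if some k-long stretch of `primer` occurs verbatim in `region`."""
    if k <= 0:
        raise ValueError("k must be positive")
    primer, region = primer.upper(), region.upper()
    # Slide a k-wide window along the primer; an empty range means primer is too short.
    return any(primer[i:i + k] in region for i in range(len(primer) - k + 1))


# Toy 30-nt "conserved region" (hypothetical data).
toy_region = "ACGTTGCAAGGCTTAGCCATGGATCCGTAA"
long_primer = "TT" + toy_region[5:22]   # carries 17 contiguous nucleotides of the region
short_primer = toy_region[0:12]         # only 12 nucleotides long

print(shares_contiguous_run(long_primer, toy_region))   # True
print(shares_contiguous_run(short_primer, toy_region))  # False
```
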

INTERNATIONAL PRELIMINARY EXAMINATION REPORT
International application No. PCT/US99/18750

I. Basis of the report 

1. This report has been drawn on the basis of (substitute sheets which have been furnished to the receiving Office in
response to an invitation under Article 14 are referred to in this report as "originally filed" and are not annexed to 
the report since they do not contain amendments.): 

Description, pages: 

1-30 as originally filed

Claims, No.:

1-45 as amended under Article 19

Drawings, sheets:

1/13-13/13 as originally filed

2. The amendments have resulted in the cancellation of: 

□ the description, pages: 

□ the claims, Nos.: 

□ the drawings, sheets: 

3. □ This report has been established as if (some of) the amendments had not been made, since they have been
considered to go beyond the disclosure as filed (Rule 70.2(c)):

4. Additional observations, if necessary: 

see separate sheet 
V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial
applicability; citations and explanations supporting such statement 



1. Statement 

Novelty (N) Yes: Claims 1-27,33-45 

No: Claims 28-32 

Inventive step (IS) Yes: Claims 4-16,22,24,33-45 

No: Claims 1-3,17-21,23,25-32 

Industrial applicability (IA) Yes: Claims 1-45 

No: Claims 



2. Citations and explanations 



see separate sheet 



VII. Certain defects in the international application

The following defects in the form or contents of the international application have been noted: 
see separate sheet 

VIII. Certain observations on the international application 

The following observations on the clarity of the claims, description, and drawings or on the question whether the
claims are fully supported by the description, are made:

see separate sheet 
1. This written opinion is the first drawn up by this International Preliminary Examining Authority.

II
III
IV
V Reasoned statement under Rule 66.2(a)(ii) with regard to novelty, inventive step or industrial applicability;
citations and explanations supporting such statement
VI Certain documents cited
VII Certain defects in the international application
VIII Certain observations on the international application



3. The applicant is hereby invited to reply to this opinion.

When? See the time limit indicated above. The applicant may, before the expiration of that time limit,
request this Authority to grant an extension, see Rule 66.2(d).

How? By submitting a written reply, accompanied, where appropriate, by amendments, according to Rule 66.3.
For the form and the language of the amendments, see Rules 66.8 and 66.9.

Also: For an additional opportunity to submit amendments, see Rule 66.4.
For the examiner's obligation to consider amendments and/or arguments, see Rule 66.4 bis.
For an informal communication with the examiner, see Rule 66.6.

I. Basis of the opinion

1. This opinion has been drawn on the basis of (substitute sheets which have been furnished to the receiving Office
in response to an invitation under Article 14 are referred to in this opinion as "originally filed".):

Description, pages:

1-30 as originally filed

Claims, No.:

1-45 as received on 16/03/2000 with letter of 13/03/2000

Drawings, sheets:

1/13-13/13 as originally filed



2. The amendments have resulted in the cancellation of: 

□ the description, pages:

□ the claims, Nos.: 

□ the drawings, sheets: 

3. This opinion has been established as if (some of) the amendments had not been made, since they have been 
considered to go beyond the disclosure as filed (Rule 70.2(c)): 

4. Additional observations, if necessary: 
see separate sheet 

V. Reasoned statement under Rule 66.2(a)(ii) with regard to novelty, inventive step or industrial 
applicability; citations and explanations supporting such statement 

1. Statement 

Novelty (N) Claims 28-32 (NO) 

Inventive step (IS) Claims 1-3,17-21,23,25-27 (NO) 

Industrial applicability (IA) Claims 

2. Citations and explanations 
see separate sheet 
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VII. Certain defects in the international application 

The following defects in the form or contents of the international application have been noted: 
see separate sheet 

VIII. Certain observations on the international application

The following observations on the clarity of the claims, description, and drawings or on the question whether the 
claims are fully supported by the description, are made: 

see separate sheet 
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Section I 



1. Basis of the opinion

a. Originally filed documents also include pages 1-57 of sequence listing.

b. Sequence listing pages 1-57, filed with the letter of 19.10.99, do not form part of
the application (Rule 13ter.1(f) PCT).



Section V 

2. The applicant's observations submitted with the amended claims have been 
considered in establishing this written opinion. 

3. Reference is made to the following documents: 

D1: Garbe and Stringer, Infect. Immun., Vol.62, pp.3092-3101 (1994);

D2: Chary-Reddy and Graves, J. Clin. Microbiol., Vol.34, pp.1660-1665 (1996);

D3: Kovacs et al., J. Biol. Chem., Vol.268, pp.6034-6040 (1993).



4. Novelty (Article 33(2) PCT) 

Claim 28 is directed to a nucleic acid molecule comprising a sequence selected
from the group consisting of the given portions of SEQ ID NOs 1, 3, 5, 7, 9, 11, 13, 15
and sequences with at least 70% sequence identity with the said portions. The
sequence shown in Fig.5a of D1 comprises a sequence highly homologous with
those of the claim, for instance, nucleotides 2987-3232 show >96% homology with
residues 2839-3084 of SEQ ID NO 7. Therefore the subject-matter of claim 28 is
not novel over D1. Similarly, the nucleic acid molecules of claims 29 and 30 are
comprised in the msgl sequence of D1, which thereby renders said claims not
novel; moreover, in view of the cloning methods used to gain said sequences (D1:
Materials and Methods), the recombinant vector (claim 31) and cell containing
said vector (claim 32) are also inevitably disclosed by D1.
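
The identity threshold recited in claim 28 (at least 70% sequence identity) can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned to equal length and simply counts identical positions; the function names and any sequences passed to them are hypothetical and are not taken from D1 or from the application, and a real comparison would rely on a proper alignment tool.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of aligned positions that hold the same residue.

    Assumes both strings are pre-aligned and of equal length. A position
    where either sequence has a gap ('-') is counted as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 70.0) -> bool:
    """True if the two aligned sequences reach the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Under this simple definition, a prior-art sequence showing more than 96% identity over the compared stretch would clear a 70% threshold by a wide margin.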
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5. Inventive step (Article 33(3) PCT) 

a. Methods of detecting pathogenic microorganisms based on identification of
specific DNA sequences, using PCR amplification and/or oligonucleotide probe
hybridization, are common in the art. For instance, D2 describes the identification
of P. carinii from rat tissues by PCR amplification of a portion of the rat P. carinii
msg gene (p.1660, Introduction, paragraph 2). The targeting of sequences which
are conserved in, but unique to, the pathogenic organisms in any one disease is
an important aspect of such an analysis, and indeed the primers in D2 were
chosen according to specific homology with rat msg genes (p.1663, col.1,
paragraph 1).



b. The MSG protein of P. carinii is encoded by multiple related genes producing a 
family of closely related proteins, as disclosed in D3 for rat P. carinii. However, as 
pointed out in the present application (p.2, l.20-26), more variation occurs
between msg genes isolated from different host-specific strains, so that 
sequences suitable for detection in the rat are not necessarily applicable to 
humans. 

c. Claims 1 and 23 are directed to methods of detecting Pneumocystis carinii, in
which a conserved region within human P. carinii is amplified using primers from
the human P. carinii MSG protein encoding sequence (claim 1), or is hybridized
with a probe for a conserved region within the human P. carinii MSG protein
encoding sequence (claim 23). However, these claims simply define the standard approach
to identifying pathogenic microorganisms, applied here to a specific case, without
defining the essential feature required to carry it out, i.e. the conserved sequence.

d. Moreover, D1 discloses a complete human P. carinii msg sequence, as well as 
several partial sequences: Fig.6 shows alignment of amino acid sequence data 
from four msg clones. Several portions clearly show a high degree of homology, 
which would be expected to extend also to the nucleic acid sequences. 

e. Thus, the skilled person seeking to detect human P. carinii would use standard
methods and the sequence data provided in D1, concentrating on possible
homologous regions in order to increase his chances of success. As such, the
subject-matter of claims 1 and 23 does not appear to be inventive.

Dependent claims 2, 3 and 17-21 do not appear to contain any additional features 
which, in combination with the features of the claims to which they refer, would 
render them inventive in the sense of Article 33(3) PCT, as said features are 
considered standard in the art. 

f. Claims 25-27 are directed to the human MSG proteins 1, 3, 11, 14, 32, 33 and 35,
and their corresponding nucleic acid sequences. Although these sequences have 
not previously been disclosed and are therefore novel, they are not considered to 
be inventive. The existence of a number of variants would be expected in the light 
of D3, and the availability of complete human MSG gene and protein sequences 
from D1 means that it would be routine practice for the skilled person to isolate 
such variants; this indeed is what appears to have been done in the present 
application. Therefore, claims 25-27 are not considered to be inventive. 

6a. Claims 4-16, 22 and 24 each provide preferred embodiments of the claimed 
methods, using specific sequences, all of which are directed to a particular 
conserved region of the msg genes. Although the teaching of D1 might enable the 
skilled person to try certain portions of the genes based on the incomplete 
comparisons in Fig.6, he would not specifically be directed to the conserved 
region in question. Thus said claims appear to be novel and inventive. 

b. Similarly the kits (claims 33-43) comprising primers taken from the specified 
conserved region are not anticipated by any prior art document, taken alone or in 
combination, and therefore seem to be new and inventive. 

c. The antibodies defined in claims 44 and 45 are raised against two specific 
sequences, providing different effects: one is unique to and thus specific for 
HMSG32 (claim 44); the other is for a conserved MSG epitope (claim 45). Both 
are novel and inventive, as the antigenic peptides defined are not indicated in the 
prior art. 

7. The document Mei et al., Infect. Immun., Vol.66, pp.4268-4273 (Sept. 1998), was
cited as a P,X-document in the International Search Report. However, the priority
date of 17.08.98 for the present application is considered to be valid, so that the
cited document does not count as prior art under Rule 64.1 PCT for the purposes
of Article 33 PCT.



Section VII 

8. Contrary to the requirements of Rule 5.1(a)(ii) PCT, the relevant background
disclosed in the document D2 is not mentioned in the description, nor is this 
document identified therein. 

Section VIII 



9. The following objections are under Article 6 PCT: 

a. The phrase "and conservative substitutions thereof", used in claim 25, is vague
and leaves the reader in doubt as to the exact nature of the subject-matter being
claimed, thereby rendering the definition of the subject-matter of said claims
unclear. Although conservative substitutions are discussed in the description
(p.13-14), it is unclear in what way such substitutions may be limited. In particular,
this same passage refers to sequences of at least 63% homology (which would 
include the MSG disclosed in D1), and thereby implies that the subject-matter for 
which protection is sought may be different to that defined by claims 25-27, i.e. not 
restricted to the sequences given. Therefore, a lack of clarity in the claims arises 
when using the description to interpret them. 

b. Claims 4-7, 28 and 33-35 refer to residues 2887-3132 of HMSG33 (SEQ ID
NO: 11), although said SEQ ID NO: 11 extends only as far as residue 3054.

c. The vague and imprecise statement in the description, p.30, l.9-13, referring to the
"spirit" of the invention, implies that the subject-matter for which protection is
sought may be different to that defined by the claims, thereby resulting in lack of
clarity (Article 6 PCT) when used to interpret them (see also PCT Guidelines,
C-III, 4.3a).
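
The inconsistency raised in objection 9b amounts to a bounds check on a residue range. A minimal sketch, assuming 1-based inclusive residue numbering (the function name is hypothetical; the figures are those quoted in the objection):

```python
def range_within_sequence(start: int, end: int, sequence_length: int) -> bool:
    """True if the 1-based inclusive range start..end lies inside the sequence."""
    return 1 <= start <= end <= sequence_length


# Figures quoted in objection 9b: residues 2887-3132 against a
# sequence that extends only as far as residue 3054.
print(range_within_sequence(2887, 3132, 3054))  # prints False
```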
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IDENTIFICATION OF A REGION OF THE MAJOR SURFACE 
GLYCOPROTEIN (MSG) GENE 
OF HUMAN PNEUMOCYSTIS CARINII 

FIELD OF THE INVENTION 

This invention relates to methods for detecting Pneumocystis carinii infection in humans, 
specifically to such methods that involve polymerase chain reaction or other amplification of nucleic 
acid sequences that encode a Pneumocystis carinii sp. f. hominis protein. 



BACKGROUND OF THE INVENTION 

Pneumocystis carinii is an important life-threatening opportunistic pathogen of
immunocompromised patients, especially those with human immunodeficiency virus (HIV) infection.
Conventional diagnosis of Pneumocystis carinii pneumonia (PCP) involves analysis of a tissue
sample or oropharyngeal secretion sample for the presence of a P. carinii organism through staining
and microscopic examination. Sample acquisition techniques have included such invasive methods
as transbronchial biopsy, percutaneous lung biopsy, or open lung biopsy. Each of these techniques
is fraught with possible complications and requires significant time and expense. In the mid 1980's,
bronchoalveolar lavage (BAL) was introduced as a less invasive, less expensive, and less
complication-prone technique for acquiring samples to be used in PCP diagnosis (Ognibene et al.
(1984) Am. Rev. Respir. Dis. 129:929-932). However, BAL, coupled with bronchoscopy, still
required special equipment and facilities, as well as the time of a physician and technician. Simpler
still, it is now known that the Pneumocystis organism can also be detected in induced sputum samples
(Bigby et al. (1986) Am. Rev. Respir. Dis. 133:515-518; Kovacs et al. (1988) NEJM 318:589-593).

Advances also have occurred in the techniques used to detect the Pneumocystis organism in
tissue and oropharyngeal secretion samples. Direct microscopic examination of clinical samples
stained with, for instance, Giemsa stain or toluidine blue O, requires time-consuming sample
preparation and subsequent examination by specially trained and experienced microscopy technicians
(see, for instance, Bigby et al. (1986) Am. Rev. Respir. Dis. 133:515-518). This procedure has been
somewhat simplified and rendered more amenable to mechanization through the use of monoclonal
antibodies in detection of P. carinii antigens in clinical samples (Kovacs et al. (1988) NEJM
318:589-593). A few groups have used oligonucleotide probes complementary to P. carinii
nucleotide sequences to detect the organism through hybridization, as in U.S. Pat. No. 5,164,490 (the
Santi patent).

Polymerase chain reaction (PCR)-mediated amplification of DNA or RNA-encoding
sequences has been used to diagnose various diseases including leprosy (Santos et al. (1997) J. Med.
Microbiol. 46:170-172) and PCP. This technique exhibits increased sensitivity over simple probe
hybridization methods. Primers complementary to sequences encoding P. carinii mitochondrial or
chromosomal ribosomal RNA (rRNA) have been used to amplify Pneumocystis-specific DNA
sequence, as in Wakefield et al. (1990) Mol. Biochem. Parasit. 43:69-76; Wakefield et al. (1990)
Lancet 336:451-453; Lipschik et al. (1992) Lancet 340:203-206; WO 91/19005; and U.S. Pat. Nos.
5,519,127 (the Shah patent), 5,593,836 (the Niemiec patent) and 5,776,680 (the Leibowitz patent).

Other recent research advances relate to elucidating the molecular mechanisms involved in 
P. carinii infection. A great deal of interest has focused on the major surface glycoprotein (MSG; 
also called glycoprotein A) of P. carinii, because it is considered to be both a virulence factor and a
target of host immune responses. MSG is the most abundant protein expressed on the surface of P. 
carinii, as assessed by Coomassie blue staining. It appears to play a critical role in the pathogenesis 
of pneumocystosis, possibly by acting as an attachment ligand to lung cells. MSG is also a target of 
both humoral and cellular immune responses by the host. 

Multiple genes encode the MSG of rat-P. carinii, and different MSGs may be expressed in
the lung of a rat infected with P. carinii (Angus et al. (1996) J. Exp. Med. 183:1229-1234; Kovacs et
al. (1993) J. Biol. Chem. 268:6034-6040). Similarly, multiple genes encode the MSG of P. carinii
infecting ferrets and mice (Haidaris et al. (1998) DNA Res. 5:77-85; Haidaris et al. (1992) J. Infect.
Dis. 166:1113-1123). Additional studies have shown that there is a single genomic site for
expression of rat MSG variants (Edman et al. (1996) DNA Cell Biol. 15:989-999; Sunkin and Stringer
(1996) Mol. Microbiol. 19:283-295; Wada and Nakamura (1996) DNA Res. 3:55-64; Wada et al.
(1995) J. Infect. Dis. 171:1563-1568). These studies suggest that P. carinii has developed an
elaborate system for antigenic variation, presumably to evade host defense mechanisms.

Molecular and immunological studies have clearly demonstrated that P. carinii isolated from
different host species are distinct organisms, and may in fact be separate species (Gigliotti (1992) J.
Infect. Dis. 165:329-336; Keely et al. (1994) J. Eukaryot. Microbiol. 41:94S; Kovacs et al. (1989) J.
Infect. Dis. 159:60-70; Stringer (1993) Infect. Agents Dis. 2:109-117). There is a high level of
variation among orthologous genes, including the MSG genes, isolated from different host-specific
strains of Pneumocystis. Hence, diagnosis of P. carinii infection in human patients ideally
requires P. carinii sp. f. hominis (hereinafter "human-P. carinii") derived reagents.

The cloning of human-P. carinii MSG genes has recently been reported (Garbe and Stringer
(1994) Infect. Immun. 62:3092-3101; Stringer et al. (1993) J. Eukaryot. Microbiol. 40:821-826);
however, only one full-length sequence was reported.



The inventors have discovered that human-P. carinii MSG is encoded by a large, highly
conserved gene family, with a particularly conserved region of about 100 amino acids in the
C-terminal region of the proteins. They have further discovered that direct detection or nucleic acid
amplification (e.g., PCR amplification) of human-P. carinii MSG-encoding genes provides a
particularly sensitive and specific technique for the detection of P. carinii, and the diagnosis of PCP.

This invention encompasses the purified novel human-P. carinii proteins represented by
SEQ ID NO: 2, SEQ ID NO: 4, SEQ ID NO: 6, SEQ ID NO: 8, SEQ ID NO: 10, SEQ ID NO: 12,
and SEQ ID NO: 14, and isolated nucleic acid molecules that encode these proteins. Specific nucleic
acid molecules encompassed in this invention include those represented in SEQ ID NO: 1; SEQ ID
NO: 2; SEQ ID NO: 3; SEQ ID NO: 4; SEQ ID NO: 5; SEQ ID NO: 6; SEQ ID NO: 7; SEQ ID NO:
15; and SEQ ID NO: 17. Also encompassed within this invention are the isolated nucleic acid
sequences that encode the conserved carboxy-terminal region of about 100 amino acids of the disclosed
human-P. carinii MSGs; these may be used for amplification or as probes. The sequences of these
conserved nucleic acid molecule regions include residues 2894-3042 of HMSGp1 (SEQ ID NO: 1),
2758-3006 of HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of
HMSG14 (SEQ ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ
ID NO: 11), 2821-3072 of HMSG35 (SEQ ID NO: 13), or 1-249 of HMSGp2 (SEQ ID NO: 15). In
addition, this invention encompasses sequences with at least 70% sequence identity to these regions,
and recombinant vectors comprising such nucleic acid molecules and conserved regions from within
such nucleic acid molecules, as well as transgenic cells including such a recombinant vector.

Another aspect of this invention provides a method of detecting the presence of
Pneumocystis carinii in a biological specimen, by amplifying with a nucleic acid amplification
method (e.g., the polymerase chain reaction) a human-P. carinii nucleic acid sequence using two or
more oligonucleotide primers derived from a human-P. carinii MSG protein encoding sequence, then
determining whether an amplified sequence is present. In a preferred embodiment of this invention,
the human-P. carinii nucleic acid sequence is a highly conserved region within an MSG-protein
encoding sequence. Such a highly conserved region may, for instance, include residues 2894-3042 of
HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ
ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9),
2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ ID NO: 13), or 1-249 of HMSGp2
(SEQ ID NO: 15). A further aspect of this invention is the method of detecting the presence of
Pneumocystis carinii in a biological specimen, by determining whether an amplified sequence is
present, for instance by electrophoresis and staining of the amplified sequence, or hybridization of
the amplified sequence to a labeled probe. Appropriate labels for the hybridization probe include a
fluorescent molecule, a chemiluminescent molecule, an enzyme, a co-factor, an enzyme substrate, or
a hapten. The nucleotide sequence of such a probe can be chosen from any MSG gene sequence that
is amplified in the detection method, and for instance can include a nucleic acid sequence according
to SEQ ID NO: 19.

Another aspect of this invention is a method of detecting the presence of Pneumocystis
carinii in a biological specimen by exposing the biological specimen to a probe that hybridizes to a
human-P. carinii nucleic acid sequence derived from a human-P. carinii MSG protein encoding
sequence. The labeled probe to be used in this method may, for instance, include the nucleic acid
sequence of SEQ ID NO: 19.

This invention also encompasses one or more oligonucleotide primers including at least 15,
or at least 20, 25, 30, 35, 40, 50, or 100, contiguous nucleotides from any of the highly conserved
regions within an MSG-protein encoding sequence disclosed herein, or from any nucleic acid
sequences having at least 70%, or at least 90% or 95%, sequence homology with these sequences.
Specific examples of such oligonucleotide primer sequences are shown in SEQ ID NO: 17, SEQ ID
NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 23, and SEQ ID NO: 24. Of these primers,
SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, and SEQ ID NO: 23 may serve as upstream
primers, while SEQ ID NO: 20 and SEQ ID NO: 24 may serve as downstream primers.

Kits for detection of a human-P. carinii nucleic acid sequence are another aspect of this
invention. Such kits may include at least a pair of primers each comprising at least 15, or at least 20,
25, 30, 35, 40, 45, 50, or 100 contiguous nucleotides of any of the conserved regions of the herein
disclosed MSG-encoding sequences, and homologs having at least 70% identity with such sequences.
Representative primers include those represented by the nucleotide sequences of SEQ ID NO: 17;
SEQ ID NO: 18; SEQ ID NO: 19; SEQ ID NO: 20; SEQ ID NO: 21; SEQ ID NO: 22; SEQ ID NO:
23; and SEQ ID NO: 24. These kits may further include a positive nucleic acid amplification (e.g.,
PCR) control sequence.

Antibodies raised to the peptide sequence according to SEQ ID NO: 25 or SEQ ID NO: 26
are also included within the scope of this invention.

The foregoing and other objects, features, and advantages of the invention will become more 
apparent from the following detailed description of several embodiments, which proceeds with 
reference to the accompanying figure and tables. 


BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1A-1M is an alignment of the deduced amino acid sequences encoded by two of the
human-P. carinii MSG genes contained in the genomic clone (HMSGp1, SEQ ID NO: 2; and
HMSGp3, SEQ ID NO: 4) and the five genes generated by PCR (HMSG11, SEQ ID NO: 6;
HMSG14, SEQ ID NO: 8; HMSG32, SEQ ID NO: 10; HMSG33, SEQ ID NO: 12 and HMSG35, SEQ
ID NO: 14), together with a published sequence (GBHMSG) and a rat-P. carinii MSG sequence
(RMSGGP3, GenBank accession number: L05906). A methionine was substituted for valine at
position 1 in the PCR clones during amplification to facilitate expression, and thus is excluded from
the alignment. The peptides that were synthesized and used to generate anti-peptide antibodies are
shaded in light grey in Figure 1L (conserved epitope) or dark grey (HMSG32-specific epitope). The
arrows (Figure 1L) flank the conserved region that was expressed in pET28a. The conserved
carboxy-terminal region of the proteins is boxed (Figure 1L).



SEQUENCE LISTING 

The nucleic and amino acid sequences listed in the accompanying sequence listing are
shown using standard letter abbreviations for nucleotide bases, and three letter code for amino acids.
Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood
as included by any reference to the displayed strand.

SEQ ID NO: 1 shows the nucleic acid sequence of MSG HMSGp1, GenBank Accession No: AF038556.

SEQ ID NO: 2 shows the amino acid sequence of MSG protein HMSGp1.

SEQ ID NO: 3 shows the nucleic acid sequence of MSG HMSGp3, GenBank Accession No: AF038556.

SEQ ID NO: 4 shows the amino acid sequence of MSG protein HMSGp3.

SEQ ID NO: 5 shows the nucleic acid sequence of MSG HMSG11, GenBank Accession No: AF033208.

SEQ ID NO: 6 shows the amino acid sequence of MSG protein HuMSG11.

SEQ ID NO: 7 shows the nucleic acid sequence of MSG HMSG14, GenBank Accession No: AF033209.

SEQ ID NO: 8 shows the amino acid sequence of MSG protein HuMSG14.

SEQ ID NO: 9 shows the nucleic acid sequence of MSG HMSG32, GenBank Accession No: AF033212.

SEQ ID NO: 10 shows the amino acid sequence of MSG protein HuMSG32.

SEQ ID NO: 11 shows the nucleic acid sequence of MSG HMSG33, GenBank Accession No: AF033210.

SEQ ID NO: 12 shows the amino acid sequence of MSG protein HuMSG33.

SEQ ID NO: 13 shows the nucleic acid sequence of MSG HMSG35, GenBank Accession No: AF033211.

SEQ ID NO: 14 shows the amino acid sequence of MSG protein HMSG35.

SEQ ID NO: 15 shows the nucleic acid sequence of the conserved carboxy-terminal portion
of MSG HMSGp2, GenBank Accession Number: AF038556.

SEQ ID NO: 16 shows the amino acid sequence of the conserved carboxy-terminal portion
of MSG protein HMSGp2.

SEQ ID NO: 17 shows oligonucleotide JKK14 (upstream primer).

SEQ ID NO: 18 shows oligonucleotide JKK15 (upstream primer).

SEQ ID NO: 19 shows oligonucleotide JKK16 (internal probe).

SEQ ID NO: 20 shows oligonucleotide JKK17 (downstream primer).

SEQ ID NO: 21 shows oligonucleotide JK151 (upstream cloning primer).

SEQ ID NO: 22 shows oligonucleotide JK152 (downstream cloning primer).

SEQ ID NO: 23 shows oligonucleotide JK451 (upstream C-terminal cloning primer).

SEQ ID NO: 24 shows oligonucleotide JK452 (downstream C-terminal cloning primer).

SEQ ID NO: 25 shows the amino acid sequence of the internal peptide used to generate
antibodies.

SEQ ID NO: 26 shows the amino acid sequence of the C-terminal peptide used to generate
antibodies.
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DETAILED DESCRIPTION OF THE INVENTION 

I. Abbreviations and Definitions

A. Abbreviations 

PCP: Pneumocystis carinii pneumonia (pneumocystosis) 
MSG: major surface glycoprotein 

human-P. carinii: P. carinii sp. f. hominis, human-derived Pneumocystis carinii 

B. Definitions 

Unless otherwise noted, technical terms are used according to conventional usage.
Definitions of common terms in molecular biology may be found in Benjamin Lewin, Genes V,
published by Oxford University Press, 1994 (ISBN 0-19-854287-9); Kendrew et al. (eds.), The
Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN
0-632-02182-9); and Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive
Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 1-56081-569-8).

In order to facilitate review of the various embodiments of the invention, the following 
definitions of terms are provided: 

Biological Specimen: A biological specimen is a sample of bodily fluid or tissue used for 
laboratory testing or examination. As used herein, biological specimens include all clinical samples 
useful for detection of microbial infection in subjects. 

Appropriate tissue samples may be taken from the oropharyngeal tract, for instance from 
lung or bronchial tissue. Samples can be taken by biopsy or during autopsy examination, as 
appropriate. Biological fluids include blood, derivatives and fractions of blood such as serum, and 
fluids of the oropharyngeal tract, such as sputum. 

Examples of appropriate specimens for use with the current invention for the detection of P. 
carinii include conventional clinical samples, for instance blood or blood-fractions (e.g., serum), and 
bronchoalveolar lavage (BAL), sputum, and induced sputum samples. Techniques for acquisition of 
such samples are well known in the art. Blood and blood fractions (e.g., serum) can be prepared in 
traditional ways. Oropharyngeal tract fluids can be acquired through conventional techniques, 
including sputum induction, bronchoalveolar lavage (BAL), and oral washing. Oral washing 
provides an excellent, non-invasive technique for acquiring appropriate samples to be used in nucleic 
acid amplification (e.g., PCR) of human-P. carinii MSG sequences. Obtaining a sample from oral
washing involves having the subject gargle with an amount of normal saline for about 10-30 seconds
and then expectorate the wash into a sample cup.

cDNA (complementary DNA): A piece of DNA lacking internal, non-coding segments 
(introns) and transcriptional regulatory sequences. cDNA may also contain untranslated regions 
(UTRs) that are responsible for translational control in the corresponding RNA molecule. cDNA is 
synthesized in the laboratory by reverse transcription from messenger RNA extracted from cells. 
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Isolated: An "isolated" biological component (such as a nucleic acid molecule, protein or 
organelle) has been substantially separated or purified away from other biological components in the 
cell of the organism in which the component naturally occurs, i.e., other chromosomal and extra- 
chromosomal DNA and RNA, proteins and organelles. Nucleic acids and proteins that have been 
"isolated" include nucleic acids and proteins purified by standard purification methods. The term 
also embraces nucleic acids and proteins prepared by recombinant expression in a host cell as well as
chemically synthesized nucleic acids.

Oligonucleotide: A linear polynucleotide sequence of between 10 and 100 nucleotide bases 
in length. 

Operably linked: A first nucleic acid sequence is operably linked with a second nucleic 
acid sequence when the first nucleic acid sequence is placed in a functional relationship with the 
second nucleic acid sequence. For instance, a promoter is operably linked to a coding sequence if the 
promoter affects the transcription or expression of the coding sequence. Generally, operably linked 
DNA sequences are contiguous and, where necessary to join two protein-coding regions, in the same 
reading frame. 

ORF (open reading frame): A series of nucleotide triplets (codons) coding for amino acids 
without any internal termination codons. These sequences are usually translatable into a peptide. 
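
The ORF definition above can be expressed as a short check. This is a minimal sketch, assuming the standard genetic code's three stop codons and a DNA string that is already in the intended reading frame; the function name is hypothetical and the check is illustrative only.

```python
# Stop codons of the standard genetic code, written in the DNA alphabet
# (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(dna: str) -> bool:
    """True if dna is a whole number of codons with no internal stop codon.

    A stop codon in the final position is allowed, since it terminates
    the frame rather than interrupting it.
    """
    dna = dna.upper()
    if len(dna) == 0 or len(dna) % 3 != 0:
        return False
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    return not any(codon in STOP_CODONS for codon in codons[:-1])
```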

Ortholog: Two nucleic acid or amino acid sequences are orthologs of each other if they 
share a common ancestral sequence and diverged when a species carrying that ancestral sequence 
split into two species. P. carinii isolated from different host species (for instance rats and humans) 
are known to be distinct organisms, and may in fact be separate Pneumocystis species. Because of 
this, genes and proteins derived from P. carinii isolated from different host species are orthologous to 
each other (e.g., the MSG11 gene isolated from human-P. carinii (HMSG11) would be an ortholog of
MSG11 isolated from rat-P. carinii). Orthologous sequences are also homologous sequences.

Probes and primers: Nucleic acid probes and primers can be readily prepared based on the 
nucleic acid molecules provided in this invention. A probe comprises an isolated nucleic acid attached 
to a detectable label or reporter molecule. Typical labels include radioactive isotopes, enzyme 
substrates, co-factors, ligands, chemiluminescent or fluorescent agents, haptens, and enzymes. Methods 
for labeling and guidance in the choice of labels appropriate for various purposes are discussed, e.g., in 
Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York, 1989)
and Ausubel et al. (In Current Protocols in Molecular Biology, Greene Publ. Assoc. and
Wiley-Intersciences, 1992).

Primers are short nucleic acid molecules, preferably DNA oligonucleotides 15 nucleotides or 
more in length. Primers can be annealed to a complementary target DNA strand by nucleic acid 
hybridization to form a hybrid between the primer and the target DNA strand, and then the primer 
extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for 
amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic- 
acid amplification methods known in the art. 
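
The annealing step described above depends on base complementarity. A minimal sketch, assuming unambiguous DNA bases and exact (mismatch-free) pairing, with both strands written 5' to 3'; the function names are hypothetical and real annealing also depends on temperature and salt conditions that are not modeled here.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA strand written 5'->3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def anneals_perfectly(primer: str, target_strand: str) -> bool:
    """True if the primer is exactly complementary to a stretch of the target.

    Both sequences are written 5'->3', so the primer pairs with the target
    wherever the primer's reverse complement occurs in the target strand.
    """
    return reverse_complement(primer).upper() in target_strand.upper()
```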
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Methods for preparing and using probes and primers are described, for example, in Sambrook 
et al. (In Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York, 1989), Ausubel et 
al. (In Current Protocols in Molecular Biology, Greene Publ. Assoc. and Wiley-Intersciences, 1992), 
and Innis et al. (PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc., San
Diego, CA, 1990). PCR primer pairs can be derived from a known sequence, for example, by using
computer programs intended for that purpose such as Primer (Version 0.5, ©1991, Whitehead Institute
for Biomedical Research, Cambridge, MA). One of ordinary skill in the art will appreciate that the
specificity of a particular probe or primer increases with its length. Thus, for example, a primer
comprising 20 consecutive nucleotides of the human-P. carinii MSG11 gene will anneal to a target
sequence, such as another MSG gene homolog from the gene family contained within a human-P.
carinii genomic DNA library, with a higher specificity than a corresponding primer of only 15
nucleotides. Thus, in order to obtain greater specificity, probes and primers can be selected that
comprise 20, 25, 30, 35, 40, 50 or more consecutive nucleotides of human-P. carinii MSG gene
sequences.

The invention thus includes isolated nucleic acid molecules that comprise specified lengths of
the disclosed human-P. carinii MSG gene sequences. Such molecules may comprise at least 20, 25, 30,
35, 40 or 50 consecutive nucleotides of these sequences, and may be obtained from any region of the
disclosed sequences. By way of example, the human-P. carinii MSG gene sequences may be
apportioned into halves or quarters based on sequence length, and the isolated nucleic acid molecules
may be derived from the first or second halves of the molecules, or any of the four quarters. The
human-P. carinii MSG11 gene, shown in SEQ ID NO: 5, can be used to illustrate this. The human-P.
carinii MSG11 gene is 3088 nucleotides in length and so may be hypothetically divided into approximate
halves (nucleotides 1-1544 and 1545-3088) or approximate quarters (nucleotides 1-772, 773-1544, 1545-2371
and 2372-3088), for instance. Nucleic acid molecules may be selected that comprise at least 20, 25, 30,
35, 40 or 50 consecutive nucleotides of any of these portions of the human-P. carinii MSG11 gene.
Thus, one such nucleic acid molecule might comprise at least 25 consecutive nucleotides of the region
comprising nucleotides 2372-3088 of the disclosed human-P. carinii MSG11 gene (SEQ ID NO: 5).
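
By way of illustration only, the selection of fragments of a given length from a stated portion of a gene can be sketched in Python. The function names below are hypothetical and are not part of the disclosure; coordinates are assumed to be 1-based and inclusive, as in the nucleotide ranges given above, and the short test sequence is a placeholder rather than MSG sequence.

```python
from typing import Iterator, Tuple


def count_windows(region_start: int, region_end: int, length: int) -> int:
    """Count the distinct runs of `length` consecutive nucleotides in a region.

    Coordinates are 1-based and inclusive (an assumption matching the text).
    """
    if region_start < 1 or region_end < region_start or length < 1:
        raise ValueError("invalid region or fragment length")
    region_length = region_end - region_start + 1
    return max(0, region_length - length + 1)


def windows(sequence: str, region_start: int, region_end: int,
            length: int) -> Iterator[Tuple[int, str]]:
    """Yield (1-based start position, fragment) for each window in the region."""
    region_end = min(region_end, len(sequence))
    for start in range(region_start, region_end - length + 2):
        # Convert the 1-based start to a 0-based slice.
        yield start, sequence[start - 1:start - 1 + length]


# Number of distinct 25-nucleotide fragments in the region 2372-3088.
n_fragments = count_windows(2372, 3088, 25)

# Placeholder sequence, used only to show the coordinate handling.
toy_fragments = list(windows("ACGTACGTAC", 3, 8, 4))
```

Under these assumptions, a region of 717 nucleotides (2372-3088) contains 693 distinct fragments of 25 consecutive nucleotides.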

Further nucleic acid molecules might comprise at least 15 consecutive nucleotides of the
regions encoding the conserved carboxy-terminal portion of each human-P. carinii MSG gene. These
regions comprise nucleotides 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ
ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7),
2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35
(SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15), respectively.

Recombinant: A recombinant nucleic acid is one that has a sequence that is not naturally
occurring or has a sequence that is made by an artificial combination of two otherwise separated
segments of sequence. This artificial combination can be accomplished by chemical synthesis or,
more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic
engineering techniques.

Sequence identity: The similarity between two nucleic acid sequences, or two amino acid
sequences, is expressed in terms of the similarity between the sequences, otherwise referred to as
sequence identity. Sequence identity is frequently measured in terms of percentage identity (or
similarity or homology); the higher the percentage, the more similar the two sequences are. Homologs
of human-P. carinii MSG proteins, and the corresponding gene sequences, will possess a relatively high
degree of sequence identity when aligned using standard methods. This homology will be more
significant when the proteins or gene sequences are derived from P. carinii isolated from one host
species (i.e., two human-P. carinii MSG homologs will typically have greater sequence identity than
that shown by one human- and one rat-P. carinii MSG ortholog).

Typically, human-P. carinii MSG homologs are 74 to 91% identical at the nucleotide level
and 63 to 88% identical at the amino acid level when comparing pairs of clones. In comparison, there
is approximately 60% identity at the DNA level and 40% identity at the amino acid level when
comparing a human P. carinii MSG to the rat P. carinii ortholog MSGGP3.
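
For illustration, percentage identity over a pairwise alignment can be computed as in the following Python sketch. It is not the BLAST calculation referred to in this specification: it assumes the two sequences have already been aligned (gaps shown as "-") and it counts every alignment column, including gapped columns, in the denominator, which is only one of several conventions in use. The function name is hypothetical. The two sequences compared are primers JKK14 and JKK15 (SEQ ID NOS: 17 and 18), which differ at a single position.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same residue.

    Inputs must be pre-aligned and of equal length; '-' marks a gap.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


jkk14 = "GAATGCAAATCCTTACAGACAACAG"
jkk15 = "GAATGCAAATCTTTACAGACAACAG"
identity = percent_identity(jkk14, jkk15)  # 24 of 25 positions match
```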

Methods of alignment of sequences for comparison are well known in the art. Various
programs and alignment algorithms are described in: Smith & Waterman (1981) Adv. Appl. Math. 2:
482; Needleman & Wunsch (1970) J. Mol. Biol. 48: 443; Pearson & Lipman (1988) Proc. Natl. Acad.
Sci. USA 85: 2444; Higgins & Sharp (1988) Gene 73: 237-244; Higgins & Sharp (1989) CABIOS 5:
151-153; Corpet et al. (1988) Nuc. Acids Res. 16, 10881-90; Huang et al. (1992) Computer Appls. in
the Biosciences 8, 155-65; and Pearson et al. (1994) Meth. Mol. Bio. 24, 307-31. Altschul et al. (1990)
J. Mol. Biol. 215:403-410, presents a detailed consideration of sequence alignment methods and
homology calculations.

The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al. (1990) J. Mol. Biol.
215:403-410) is available from several sources, including the National Center for Biotechnology
Information (NCBI, Bethesda, MD) and on the Internet, for use in connection with the sequence
analysis programs blastp, blastn, blastx, tblastn and tblastx. It can be accessed at
http://www.ncbi.nlm.nih.gov/BLAST/. A description of how to determine sequence identity using this
program is available at http://www.ncbi.nlm.nih.gov/BLAST/blast_help.html. For comparisons of
amino acid sequences of greater than about 30 amino acids, the Blast 2 sequences function is employed
using the default BLOSUM62 matrix set to default parameters (gap existence cost of 11, and a per
residue gap cost of 1). When aligning short peptides (fewer than around 30 amino acids), the alignment
should be performed using the Blast 2 sequences function, employing the PAM30 matrix set to default
parameters (open gap 9, extension gap 1 penalties).
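
The choice of matrix and gap penalties described above can be summarized in a short Python sketch. The class and function names are hypothetical and do not correspond to any BLAST interface; the cutoff of exactly 30 residues is an assumption, since the text gives the boundary only as "about 30".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Blast2Settings:
    matrix: str
    gap_open: int     # gap existence (open) cost
    gap_extend: int   # per-residue gap (extension) cost


def settings_for_length(n_residues: int, short_cutoff: int = 30) -> Blast2Settings:
    """Return the scoring settings stated in the text for a peptide of this length."""
    if n_residues < 1:
        raise ValueError("sequence length must be positive")
    if n_residues < short_cutoff:
        # Short peptides: PAM30, open gap 9, extension gap 1.
        return Blast2Settings("PAM30", 9, 1)
    # Longer sequences: BLOSUM62, existence cost 11, per-residue cost 1.
    return Blast2Settings("BLOSUM62", 11, 1)


long_settings = settings_for_length(1028)   # e.g., a full-length MSG protein
short_settings = settings_for_length(15)    # e.g., a short peptide window
```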

Other members of the gene family of the disclosed human-P. carinii MSG proteins typically
possess at least 60% sequence identity counted over full-length alignment with the amino acid sequence
of human-P. carinii MSG using the NCBI Blast 2.0, gapped blastp set to default parameters. Sequence
identity over the about 100 C-terminal amino acids will typically be higher than 60%, for instance
about 63%. Proteins with even greater similarity to the reference sequence will show increasing
percentage identities when assessed by this method, such as at least 70%, at least 75%, at least 80%, at
least 90%, at least 95%, or at least 98% sequence identity. When less than the entire sequence is being
compared for sequence identity, homologs will typically possess at least 75% sequence identity over
short windows of 10-20 amino acids, and may possess sequence identities of at least 85% or at least
90% or 95% depending on their similarity to the reference sequence. Methods for determining
sequence identity over such short windows are described at
http://www.ncbi.nlm.nih.gov/BLAST/blast_FAQs.html.

One of ordinary skill in the art will appreciate that these sequence identity ranges are provided 
for guidance only; it is entirely possible that strongly significant homologs could be obtained that fall 
outside of the ranges provided. The present invention provides not only the peptide homologs that are 
described above, but also nucleic acid molecules that encode such homologs. 

An alternative indication that two nucleic acid molecules are closely related is that the two 
molecules hybridize to each other under stringent conditions. Stringent conditions are sequence- 
dependent and are different under different environmental parameters. Generally, stringent conditions 
are selected to be about 5°C to 20°C lower than the thermal melting point (Tm) for the specific 
sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength
and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Conditions for
nucleic acid hybridization and calculation of stringencies can be found in Sambrook et al. ((1989) In
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York) and Tijssen ((1993)
Laboratory Techniques in Biochemistry and Molecular Biology: Hybridization with Nucleic Acid
Probes Part I, Chapter 2, Elsevier, New York). Nucleic acid molecules that hybridize under stringent
conditions to a human-P. carinii MSG gene sequence will typically hybridize to a probe based on either 
an entire human-P. carinii MSG gene or selected portions of the gene under wash conditions of 2x SSC 
at 50°C. A more detailed discussion of hybridization conditions is presented below. 

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode 
similar amino acid sequences, due to the degeneracy of the genetic code. It is understood that 
changes in nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid 
molecules that all encode substantially the same protein. 

Specific binding agent: An agent that binds substantially only to a defined target. Thus an
MSG protein-specific binding agent binds substantially only the MSG protein. As used herein, the term
"MSG protein specific binding agent" includes anti-MSG protein antibodies and other agents that bind
substantially only to the MSG protein.

Anti-MSG protein antibodies may be produced using standard procedures described in a
number of texts, including Harlow and Lane (Antibodies, A Laboratory Manual, CSHL, New York,
1988). The determination that a particular agent binds substantially only to the MSG protein may
readily be made by using or adapting routine procedures. One suitable in vitro assay makes use of the
Western blotting procedure (described in many standard texts, including Harlow and Lane (Antibodies,
A Laboratory Manual, CSHL, New York, 1988)). Western blotting may be used to determine that a
given MSG protein binding agent, such as an anti-MSG protein monoclonal antibody, binds
substantially only to the MSG protein.

Shorter fragments of antibodies can also serve as specific binding agents. For instance, Fabs,
Fvs, and single-chain Fvs (scFvs) that bind to MSG would be MSG-specific binding agents.

Transformed: A transformed cell is a cell into which has been introduced a nucleic acid 
molecule by molecular biology techniques. As used herein, the term transformation encompasses all
techniques by which a nucleic acid molecule might be introduced into such a cell, including 
transfection with viral vectors, transformation with plasmid vectors, and introduction of naked DNA 
by electroporation, lipofection, and particle gun acceleration. 

Vector: A nucleic acid molecule as introduced into a host cell, thereby producing a 
transformed host cell. A vector may include nucleic acid sequences that permit it to replicate in a 
host cell, such as an origin of replication. A vector may also include one or more selectable marker 
genes and other genetic elements known in the art. 

II. Human-P. carinii MSG Sequences

This specification provides MSG proteins and MSG-encoding nucleic acid molecules,
including gene sequences, derived from human-P. carinii. The prototypical MSG sequences are the
human-P. carinii sequences as presented herein (HMSGp1, HMSGp3, HMSG11, HMSG14, HMSG32,
HMSG33, and HMSG35).

a. Human-P. carinii HMSGp1, HMSGp3, HMSG11, HMSG14, HMSG32, HMSG33, and HMSG35

Human-P. carinii HMSGp1, HMSGp3, HMSG11, HMSG14, HMSG32, HMSG33, and
HMSG35 genomic sequences are shown in SEQ ID NOS: 1, 3, 5, 7, 9, 11, and 13, respectively. The
sequences typically encode proteins that are about 1000 to about 1030 amino acids in length (for
instance, SEQ ID NO: 5 shows the amino acid sequence of the MSG11 protein, which is 1028 amino
acids long). These human-P. carinii MSG proteins show significant sequence similarity to each
other, and a lesser degree of sequence similarity to MSG proteins derived from organisms in other
hosts.

With the provision herein of seven novel human-P. carinii MSG gene sequences, nucleotide
amplification methods, for instance polymerase chain reaction (PCR), may now be utilized as a
preferred method for producing nucleic acid sequences encoding these human-P. carinii MSG
proteins. For example, PCR amplification of the human-P. carinii MSG11 gene sequence may be
accomplished by direct PCR from a clinical sample. Methods and conditions for direct PCR are
known in the art and are described in Innis et al. (PCR Protocols, A Guide to Methods and
Applications, Academic Press, Inc., San Diego, CA, 1990). Appropriate sampling methods are
described more fully below.

The selection of amplification primers will be made according to the portions of the gene
that are to be amplified. Primers may be chosen to amplify small segments of the gene, the open
reading frame, or the entire gene sequence. Variations in amplification conditions may be required to
accommodate primers of differing lengths; such considerations are well known in the art and are
discussed in Innis et al. (PCR Protocols, A Guide to Methods and Applications, Academic Press, Inc.,
San Diego, CA, 1990), Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor, New York, 1989), and Ausubel et al. (In Current Protocols in Molecular Biology, Greene
Publ. Assoc. and Wiley-Intersciences, 1992). By way of example only, the human-P. carinii
HMSG11 gene as shown in SEQ ID NO: 5 can be amplified using the following combination of
primers:

primer JK151: 5' TTT CAT ATG GCG CGG GCG GTC AAG CGG CAG 3' (SEQ ID NO: 21)
primer JK152: 5' CTA AAT CAT GAA CGA AAT AAC CAT TGC TAC 3' (SEQ ID NO: 22).

The sequence encoding the conserved carboxy-terminal region of human-P. carinii HMSG11 can be
amplified using the following primer pair:

primer JKK14: 5' GAA TGC AAA TCC TTA CAG ACA ACA G 3' (SEQ ID NO: 17)
primer JKK17: 5' AAA TCA TGA ACG AAA TAA CCA TTG C 3' (SEQ ID NO: 20).

These primers are illustrative only; one skilled in the art will appreciate that many different primers 
may be derived from the provided MSG gene sequences in order to amplify particular regions of these 
molecules. Resequencing of PCR products obtained by these amplification procedures is 
recommended; this will facilitate confirmation of the amplified sequence and will also provide 
information on natural variation on this sequence in different ecotypes and plant populations. 
Oligonucleotides derived from the human-P. carinii MSG gene sequences provided may be used in 
such sequencing methods. 

Further homologous human-P. carinii MSGs can be cloned in a similar manner. In order to 
increase the number of MSGs that can be amplified in a single PCR reaction, a third primer can be 
added. For instance, a second upstream primer (e.g., primer JKK15: 5' GAA TGC AAA TCT TTA 
CAG ACA ACA G 3' (SEQ ID NO: 18)) may be added to the amplification reaction along with 
primers JKK14 and JKK17. Typically, when more than two primers are provided in a single PCR 
amplification reaction, those primers that anneal to the same site on the target nucleotide sequence 
(e.g., JKK14 and JKK15) will be provided in equimolar amounts (for instance, 0.625 pM each), and
such that the total amount of primer provided for each end of the amplicon will be equivalent (for
instance, 1.25 pM each).
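
The arithmetic of this primer balancing can be sketched as follows. This is an illustrative Python fragment only; the function name and the "upstream"/"downstream" labels are hypothetical, and the concentrations are carried in whatever unit is used in the text above.

```python
from typing import Dict, List


def primer_concentrations(sites: Dict[str, List[str]],
                          total_per_end: float) -> Dict[str, float]:
    """Split the same total primer amount equally among primers sharing an annealing site."""
    concentrations = {}
    for site, primers in sites.items():
        if not primers:
            raise ValueError("no primers listed for site %r" % site)
        share = total_per_end / len(primers)
        for name in primers:
            concentrations[name] = share
    return concentrations


# JKK14 and JKK15 anneal to the same upstream site; JKK17 is the sole downstream primer.
mix = primer_concentrations(
    {"upstream": ["JKK14", "JKK15"], "downstream": ["JKK17"]},
    total_per_end=1.25,
)
```

With a total of 1.25 per end, the two upstream primers each receive 0.625 and the single downstream primer receives 1.25, matching the example amounts given above.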

Oligonucleotides that are derived from the human-P. carinii HMSGp1, HMSGp3, HMSG11,
HMSG14, HMSG32, HMSG33, and HMSG35 gene sequences (SEQ ID NOS: 1, 3, 5, 7, 9, 11, and 13,
respectively), as well as the fragment of HMSGp2 disclosed (SEQ ID NO: 15), are encompassed
within the scope of the present invention. Preferably, such oligonucleotide primers will comprise a
sequence of at least 15-20 consecutive nucleotides of the relevant human-P. carinii MSG gene
sequence. To enhance amplification specificity, oligonucleotide primers comprising at least 25, 30,
35, 40, 45 or 50 consecutive nucleotides of these sequences may also be used. These primers may,
for instance, be obtained from any region of the disclosed sequences. By way of example, human-P.
carinii MSG gene sequences may be apportioned into halves or quarters based on sequence length,
and the isolated nucleic acid molecules may be derived from the first or second halves of the
molecules, or any of the four quarters. In addition, primers may be specifically chosen from the
conserved carboxy-terminal region of each MSG coding sequence. This region comprises nucleic
acid residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3),
2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of
HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ
ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15).



b. MSG Sequence Variants

With the provision of human-P. carinii HMSGp1, HMSGp3, HMSG11, HMSG14,
HMSG32, HMSG33, and HMSG35 proteins and corresponding gene sequences herein, the creation
of variants of these sequences is now enabled.

Variant MSG proteins include proteins that differ in amino acid sequence from the human-P.
carinii MSG sequences disclosed but that share at least 63% amino acid sequence homology (for
example at least 80%, 90%, 95% or 98% homology) with any of the provided human MSG proteins.
Such variants may be produced by manipulating the nucleotide sequence of, for instance, the
human-P. carinii HMSG11 gene using standard procedures, including for instance site-directed mutagenesis
or PCR. The simplest modifications involve the substitution of one or more amino acids for amino
acids having similar biochemical properties. These so-called conservative substitutions are likely to
have minimal impact on the activity of the resultant protein. Table 1 shows amino acids that may be
substituted for an original amino acid in a protein, and which are regarded as conservative
substitutions.

Table 1. Conservative Substitutions

Original Residue    Conservative Substitutions
Ala                 ser
Arg                 lys
Asn                 gln; his
Asp                 glu
Cys                 ser
Gln                 asn
Glu                 asp
Gly                 pro
His                 asn; gln
Ile                 leu; val
Leu                 ile; val
Lys                 arg; gln; glu
Met                 leu; ile
Phe                 met; leu; tyr
Ser                 thr
Thr                 ser
Trp                 tyr
Tyr                 trp; phe
Val                 ile; leu
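
For illustration, Table 1 can be encoded as a simple lookup so that a proposed substitution can be checked against it. The Python sketch below is not part of the disclosure; the names are hypothetical, and the lookup is deliberately directional because the table as given lists, for example, ser as a substitute for Ala but not ala as a substitute for Ser.

```python
# Table 1 as a mapping from original residue to its listed conservative substitutes.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if Table 1 lists `replacement` as a conservative substitute for `original`."""
    listed = CONSERVATIVE_SUBSTITUTIONS.get(original.capitalize(), set())
    return replacement.capitalize() in listed


ala_to_ser = is_conservative("Ala", "ser")   # listed in Table 1
ser_to_ala = is_conservative("Ser", "ala")   # not listed in Table 1
```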



More substantial changes in enzymatic function or other protein features may be obtained by
selecting amino acid substitutions that are less conservative than those listed in Table 1. Such
changes include changing residues that differ more significantly in their effect on maintaining
polypeptide backbone structure (e.g., sheet or helical conformation) near the substitution, charge or
hydrophobicity of the molecule at the target site, or bulk of a specific side chain. The following
substitutions are generally expected to produce the greatest changes in protein properties: (a) a
hydrophilic residue (e.g., seryl or threonyl) is substituted for (or by) a hydrophobic residue (e.g.,
leucyl, isoleucyl, phenylalanyl, valyl or alanyl); (b) a cysteine or proline is substituted for (or by) any
other residue; (c) a residue having an electropositive side chain (e.g., lysyl, arginyl, or histidyl) is
substituted for (or by) an electronegative residue (e.g., glutamyl or aspartyl); or (d) a residue having a
bulky side chain (e.g., phenylalanine) is substituted for (or by) one lacking a side chain (e.g.,
glycine).

Variant MSG genes may be produced by standard DNA mutagenesis techniques, for
example, M13 primer mutagenesis. Details of these techniques are provided in Sambrook et al. (In
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York, 1989), Ch. 15. By the
use of such techniques, variants may be created which differ in minor ways from the human-P. carinii
MSG gene sequences disclosed. DNA molecules and nucleotide sequences which are derivatives of
those specifically disclosed herein and that differ from those disclosed by the deletion, addition, or
substitution of nucleotides while still encoding a protein that has at least 63% sequence identity with
the MSG sequences disclosed (SEQ ID NOS: 1, 3, 5, 7, 9, 11, and 13) are comprehended by this
invention. In their most simple form, such variants may differ from the disclosed sequences by
alteration of the coding region to fit the codon usage bias of the particular organism into which the
molecule is to be introduced.

Alternatively, the coding region may be altered by taking advantage of the degeneracy of the
genetic code to alter the coding sequence such that, while the nucleotide sequence is substantially
altered, it nevertheless encodes a protein having an amino acid sequence substantially similar to the
disclosed human P. carinii MSG protein sequences. For example, the 2nd amino acid residue of the
human P. carinii HMSG11 protein is alanine. The nucleotide codon triplet GCG encodes this alanine
residue. Because of the degeneracy of the genetic code, three other nucleotide codon triplets - GCT, 
GCC and GCA - also code for alanine. Thus, the nucleotide sequence of the human P. carinii 
HMSG11 ORF could be changed at this position to any of these three alternative codons without 
affecting the amino acid composition or characteristics of the encoded protein. Based upon the 
degeneracy of the genetic code, variant DNA molecules may be derived from the cDNA and gene 
sequences disclosed herein using standard DNA mutagenesis techniques as described above, or by 
synthesis of DNA sequences. Thus, this invention also encompasses nucleic acid sequences which 
encode an MSG protein, but which vary from the disclosed nucleic acid sequences by virtue of the 
degeneracy of the genetic code. 
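
The alanine example above can be reproduced with a short Python sketch that lists the synonymous codons for any codon. It assumes the standard genetic code, written here in the conventional TCAG ordering; the function name is hypothetical and the sketch is offered for illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(codon: str) -> list:
    """Return the other codons encoding the same amino acid, sorted alphabetically."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid and c != codon)


alternatives = synonymous_codons("GCG")  # the alanine codon discussed in the text
```

Under the standard code, GCG encodes alanine and the alternatives are GCA, GCC and GCT, in agreement with the three codons named above.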

Variants of the MSG protein may also be defined in terms of their sequence identity with the 
prototype MSG proteins shown in SEQ ID NOS: 2, 4, 6, 8, 10, 12, and 14. As described above, 
human MSG proteins share at least 60% (for example, at least 63%) amino acid sequence identity 
with the human P. carinii HMSGp1, HMSGp3, HMSG11, HMSG14, HMSG32, HMSG33, or HMSG35
proteins (SEQ ID NOS: 2, 4, 6, 8, 10, 12, and 14, respectively). Nucleic acid sequences that encode 
such proteins may readily be determined simply by applying the genetic code to the amino acid 
sequence of an MSG protein, and such nucleic acid molecules may readily be produced by 
assembling oligonucleotides corresponding to portions of the sequence. 

Nucleic acid molecules that are derived from the human P. carinii MSG gene sequences 
disclosed include molecules that hybridize under stringent conditions to the disclosed prototypical 
MSG nucleic acid molecules, or fragments thereof. Stringent conditions are hybridization at 65°C in
6 x SSC, 5 x Denhardt's solution, 0.5% SDS and 100 µg sheared salmon testes DNA, followed by
15-30 minute sequential washes at 65°C in 2 x SSC, 0.5% SDS, followed by 1 x SSC, 0.5% SDS and
finally 0.2 x SSC, 0.5% SDS.

Low stringency hybridization conditions (to detect less closely related homologs) are 
performed as described above but at 50°C (both hybridization and wash conditions); however, 
depending on the strength of the detected signal, the wash steps may be terminated after the first 2 x 
SSC wash. 

Human-P. carinii HMSGp1, HMSGp3, HMSG11, HMSG14, HMSG32, HMSG33, and
HMSG35 genes (SEQ ID NOS: 1, 3, 5, 7, 9, 11 and 13), as well as the fragment of HMSGp2
disclosed (SEQ ID NO: 15), and homologs of these sequences may be incorporated into
transformation or expression vectors.

III. Detection of P. carinii in Clinical Specimens

The conserved nature of human-P. carinii MSG genes provided in this specification, and
particularly the highly-conserved about 100 amino acid region in the C-terminal portion of the
protein, makes these genes useful targets for use in detection of P. carinii in clinical samples and
diagnosis of PCP.

a. Clinical Specimens 

Appropriate specimens for use with the current invention in detection of P. carinii include
any conventional clinical samples, for instance blood or blood-fractions (e.g., serum), and
bronchoalveolar lavage (BAL), sputum, and induced sputum samples. Techniques for acquisition of
such samples are well known in the art. See, for instance, Schluger et al. (J. Exp. Med. 176:1327-1333,
1992) (collection of serum samples); Bigby et al. (Am. Rev. Respir. Dis. 133:515-518, 1986) and
Kovacs et al. (NEJM 318:589-593, 1988) (collection of sputum samples); and Ognibene et al. (Am.
Rev. Respir. Dis. 129:929-932, 1984) (collection of bronchoalveolar lavage (BAL)).

In addition to conventional methods, oral washing provides an excellent, non-invasive
technique for acquiring appropriate samples to be used in nucleic acid amplification (e.g., PCR) of
human-P. carinii MSG sequences (Helweg-Larsen et al. (1998) J. Clin. Microbiol. 36:2068-2072).
Oral washing involves having the subject gargle with 50 cc of normal saline for 10-30 seconds and
then expectorate the wash into a sample cup.

Serum or other blood fractions can be prepared in the conventional manner. About 200 µL
of serum is an appropriate amount for the extraction of DNA for use in amplification reactions. See
also Schluger et al. (1992) J. Exp. Med. 176:1327-1333; Ortona et al. (1996) Mol. Cell. Probes
10:187-90.

Once a sample has been obtained, DNA can be extracted through any conventional method. 
For instance, rapid DNA preparation can be performed using a commercially available kit (e.g., the 
InstaGene Matrix, BioRad, Hercules, CA; the NucliSens isolation kit, Organon Teknika, 
Netherlands). Preferably the DNA preparation technique chosen yields a nucleotide preparation that 
is accessible to and amenable to nucleic acid amplification. 

b. Direct Hybridization Probing Detection 

Human-P. carinii MSG gene sequences can be detected through the hybridization of an 
oligonucleotide probe to nucleic acid molecules prepared from a clinical sample. The sequence of 
appropriate oligonucleotide probes will correspond to a region within one or more of the human-P. 
carinii MSG sequences disclosed herein. Techniques for use in hybridization of oligonucleotide 
probes to target sequences will be known to one of ordinary skill in the art. See, for instance, U.S. 
Patent Nos. 5,164,490 (disclosing use of sequences from the P. carinii dihydrofolate reductase gene 
as direct hybridization probes) and 5,519,127 (using nucleic acid probes capable of hybridizing to 
rRNA or rDNA of P. carinii for detection of the organism). In general, hybridization probes will be 
at least 15 bases in length, and may be 20, 25, 30, 35, 40 or 50 or more bases in length. For instance,
a probe may comprise the entire conserved sequence of an MSG (e.g., residues 2845-3090 of
HMSG11), or the entire coding sequence of the gene. Typically such a probe will be detectably
labeled in some fashion, either with an isotopic or non-isotopic label. Such non-isotopic labels may,
for instance, comprise a fluorescent or luminescent molecule, or an enzyme, co-factor, enzyme
substrate, or hapten. The probe is generally incubated with a single-stranded preparation of DNA,
RNA, or a mixture of both, and hybridization determined after separation of double- and
single-stranded molecules. Alternatively, probes may be incubated with a nucleotide preparation after it has
been separated by size and/or charge and immobilized on an appropriate medium. Hybridization
techniques suitable for use with oligonucleotides are well known to those of ordinary skill in the art.
For general references on the conditions and options that are appropriate, see Sambrook et al. (1989)
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York, and Ausubel et al.
(1992) In Current Protocols in Molecular Biology, Greene Publishing Associates and
Wiley-Intersciences.

c. Nucleic Acid-Mediated Detection 

It may be advantageous to amplify target P. carinii gene sequences in a clinical sample prior
to using a hybridization probe to detect their presence. For instance, for detection of human-P. carinii
MSG gene sequences, it may be advantageous to amplify part or all of the MSG gene sequence, then
detect the presence of the amplified sequence pool. Any nucleic acid amplification method can be
used, including polymerase chain reaction (PCR) amplification. Amplification can be carried out in a
simple single reaction using a pair of primers, or can be enhanced by the use of multiple degenerate
primers to increase the number of MSG homologs that are amplified. Where degenerate primers are
used, the sequence variability of the disclosed human-P. carinii MSG gene sequences can be used to
design appropriate primers that will be specific for multiple human P. carinii MSG homologs.
Alternatively, amplification specificity can be increased through the use of nested PCR techniques,
which are known (see, for instance, Lipschik et al. (1992) Lancet 340:203-206, using nested sets of
primers to rRNA in the detection of Pneumocystis carinii).
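
As a sketch of how sequence variability can be folded into a single degenerate primer, the following Python fragment collapses equal-length primers into a consensus written with the standard IUPAC nucleotide ambiguity codes. The function name is hypothetical, and the specification itself describes adding JKK14 and JKK15 as separate primers rather than as one degenerate oligonucleotide; the fragment is offered only to illustrate the principle.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases they represent.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def degenerate_consensus(primers):
    """Collapse equal-length primers into one sequence using IUPAC ambiguity codes."""
    cleaned = [p.replace(" ", "").upper() for p in primers]
    if not cleaned or len({len(p) for p in cleaned}) != 1:
        raise ValueError("primers must be non-empty and of equal length")
    # Each alignment column becomes the code for the set of bases seen at that position.
    return "".join(IUPAC_CODES[frozenset(column)] for column in zip(*cleaned))


consensus = degenerate_consensus([
    "GAA TGC AAA TCC TTA CAG ACA ACA G",  # JKK14 (SEQ ID NO: 17)
    "GAA TGC AAA TCT TTA CAG ACA ACA G",  # JKK15 (SEQ ID NO: 18)
])
```

The two primers differ only at position 12 (C or T), so the consensus carries the code Y at that position.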

It is also possible to run sequential PCR amplification experiments on samples using
different targets in each reaction, such that putative positive samples detected in the first reaction are
confirmed by amplification of a second sequence. For instance, it would be possible to analyze
clinical samples through PCR amplification of a human-P. carinii MSG gene, then to take only those
samples that are positive for amplification of MSG and test them also for the presence of P. carinii
rRNA, for instance. Such sequential testing of samples will help reduce false positive results due to
cross contamination of PCR samples; it is unlikely that a clinical sample will become contaminated
with both target sequences.

The selection of PCR primers will be made according to the portions of the gene sequence
that are to be amplified. For use in PCR detection of P. carinii, it is advantageous to choose
primer-annealing sites that are highly conserved across many different members of the human-P. carinii
MSG gene family. For instance, it is advantageous to choose primer sites from within the regions of
human-P. carinii sequence displaying greater than 63% sequence identity across the disclosed family
members, e.g., that portion of the gene encoding the conserved carboxy-terminal region of the
protein. The highly conserved carboxy-terminal regions of the disclosed genes are as follows:
residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3),
2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of HMSG32
(SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ ID NO:
13), and 1-249 of HMSGp2 (SEQ ID NO: 15).

Variations in amplification conditions may be required to accommodate primers of differing
lengths; such considerations are well known in the art and are discussed in Sambrook et al. ((1989) In
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York) and Ausubel et al. (In
Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Intersciences,
1992). By way of example only, primers JKK14, JKK15, and JKK17 (SEQ ID NOS: 17, 18, and 20
respectively) can be used to amplify the C-terminal conserved region of several human-P. carinii
MSG genes. These primers are illustrative only; one skilled in the art will appreciate that many
different primers may be derived from the provided cDNA and gene sequences in order to amplify
particular regions of these molecules.

Oligonucleotides to be used in detection of the P. carinii organism or diagnosis of PCP that 
are derived from the human-P. carinii MSG gene sequences disclosed herein are encompassed within 
the scope of the present invention. 

d. Detection of Amplified P. carinii MSG sequences 

The presence of amplified human-P. carinii MSG sequences can be determined in any
conventional manner, including electrophoresis and staining (for instance, with ethidium bromide) of
the amplified sequence, or hybridization of a labeled probe to the amplified sequence. For general
guidelines on such techniques, see Molecular Cloning: A Laboratory Manual, Cold Spring Harbor,
New York (1989), and Current Protocols in Molecular Biology, Greene Publishing Associates and
Wiley-Intersciences (1987). Hybridization probes appropriate for use in detection of amplified
human-P. carinii MSG sequences are essentially equivalent to those described above for direct
hybridization. The region of the gene that has been amplified will be important in choosing an
appropriate probe; the detection probe should hybridize to a sequence that falls between the ends of
the amplification primers such that the annealing site of the probe is amplified. By way of example,
one appropriate oligonucleotide probe is JKK16 (SEQ ID NO: 19), which corresponds to residues
3004-3029 of HMSG33. This probe could be used for detection of both full-length and carboxy- 
terminal amplified fragments of human-P. carinii MSG genes. 
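
The requirement that the probe anneal between the amplification primers can be checked with simple interval arithmetic. The following Python sketch uses the HMSG33 coordinates given in this specification for JKK16 (3004-3029) and, from Example 3, for JKK14 (2887-2911) and JKK17 (3106-3130); the class and function names are hypothetical and serve only to illustrate the check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    name: str
    start: int  # 1-based, inclusive, on the HMSG33 sequence
    end: int    # 1-based, inclusive


def amplicon_length(upstream: Site, downstream: Site) -> int:
    """Length of the product spanning both primer-annealing sites."""
    return downstream.end - upstream.start + 1


def probe_is_internal(probe: Site, upstream: Site, downstream: Site) -> bool:
    """True when the probe site lies strictly between the two primer sites."""
    return upstream.end < probe.start and probe.end < downstream.start


jkk14 = Site("JKK14", 2887, 2911)  # upstream primer
jkk17 = Site("JKK17", 3106, 3130)  # downstream primer
jkk16 = Site("JKK16", 3004, 3029)  # detection probe

expected_length = amplicon_length(jkk14, jkk17)     # 244 bases
probe_ok = probe_is_internal(jkk16, jkk14, jkk17)   # True
```

With these coordinates the expected product on HMSG33 is 244 bases long and the probe site lies wholly within it, without overlapping either primer.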

Typically, oligonucleotide probes will be labeled as discussed above, and detection will be
carried out through conventional methods. In general, detection of amplified sequences will be more
sensitive than direct hybridization.

In addition to radioisotope labeled hybridizing probes, amplicons can be detected using 
fluorescent labeled probes. One such appropriate fluorescent label is europium (Eu3+). See, for
instance, Lopez et al. (1993) Clin. Chem. 39(2):196-201 (using a europium derivative for
time-resolved fluorescence detection of amplified human papillomavirus sequences); Eskola et al. (1994)
Clin. Biochem. 27(5):373-379 (using PCR and europium-labeled DNA probes to detect a marker for 
chronic myelogenous leukemia); and Dahlen et al. (1991) J. Clin. Microbiol. 29(4):798-804 
(detection of PCR amplified HIV sequences using biotinylated and europium labeled oligonucleotide 
probes). 

e. Preparation of a Positive Nucleic Acid Amplification Control

It is advantageous to provide a positive control sequence for use in nucleic acid
amplification reactions, to ensure that the system is functioning properly. The positive control
sequence should be one to which the provided oligonucleotide primers are known to anneal. Therefore, in
the present invention, appropriate positive control sequences include, for instance, any sequences that
can be amplified with the same primers as are used to amplify human-P. carinii MSG. For instance,
primers JKK14 (SEQ ID NO: 17) and JKK17 (SEQ ID NO: 20) can serve as appropriate primers. It
is advantageous, however, if the internal amplified sequence is distinguishable from the MSG target
(i.e., is a mimic rather than an identical sequence); this allows specific and separate detection of the
target and mimic amplified products. Appropriate differences between the two sequences include
overall length of the amplicon (where detection of the PCR products will be performed using
electrophoresis and subsequent staining) and amplicon sequence differences (where detection of the
PCR products will be performed using hybridization to a labeled probe specific for each amplified
sequence).

Nucleic acid amplification positive control sequences can be provided in the form of 
independent, linear nucleotide sequences. Alternatively, a recombinant vector comprising the
appropriate positive control sequence may be provided. Construction of such a recombinant vector is 
by conventional means, and any of a myriad of conventional cloning vectors can be used. In general, 
the vector will include one or more restriction enzyme sites into which the PCR control sequence can 
be inserted. The vector may also comprise a replication site to provide for its production in a suitable 
host cell, for instance in a bacterial cell. The choice of appropriate cloning vector will be within the 
skill of an ordinary artisan. 

IV. Kits For Detection of P. carinii

The oligonucleotide primers disclosed herein can be supplied in the form of a kit for use in 
detection of P. carinii or diagnosis of PCP. In such a kit, an appropriate amount of one or more of 
the oligonucleotide primers is provided in one or more containers. The oligonucleotide primers may 
be provided suspended in an aqueous solution or as a freeze-dried or lyophilized powder, for 
instance. The container(s) in which the oligonucleotide(s) are supplied can be any conventional 
container that is capable of holding the supplied form, for instance, microfuge tubes, ampoules, or 
bottles. In some applications, pairs of primers may be provided in pre-measured single use amounts 
in individual, typically disposable, tubes or equivalent containers. With such an arrangement, the
sample to be tested for the presence of human-P. carinii can be added to the individual tubes and
amplification carried out directly. 

The amount of each oligonucleotide primer supplied in the kit can be any appropriate 
amount, depending for instance on the market to which the product is directed. For instance, if the kit 
is adapted for research or clinical use, the amount of each oligonucleotide primer provided would 
likely be an amount sufficient to prime several PCR amplification reactions. Those of ordinary skill
in the art know the amount of oligonucleotide primer that is appropriate for use in a single 
amplification reaction. General guidelines may for instance be found in Innis et al. (PCR Protocols, 
A Guide to Methods and Applications, Academic Press, Inc., San Diego, CA, 1990), Sambrook et al.
(In Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, New York, 1989), and Ausubel
et al. (In Current Protocols in Molecular Biology, Greene Publ. Assoc. and Wiley-Intersciences,
1992). 

A kit may include more than two primers, in order to facilitate the PCR amplification of a 
larger number of human-P. carinii MSG genes. For instance, primers JKK14 (SEQ ID NO: 17) and 
JKK15 (SEQ ID NO: 18) both may be provided as upstream primers, while primer JKK17 (SEQ ID 
NO: 20) is provided as a downstream primer. These primers are provided by way of example only. 

In some embodiments of the current invention, kits may also include the reagents necessary 
to carry out PCR amplification reactions, including, for instance, DNA sample preparation reagents, 
appropriate buffers (e.g., polymerase buffer), salts (e.g., magnesium chloride), and
deoxyribonucleotides (dNTPs). 

Kits may in addition include either labeled or unlabeled oligonucleotide probes for use in 
detection of the amplified human-P. carinii sequences. The appropriate sequences for such a probe 
will be any sequence that falls between the annealing sites of the two provided oligonucleotide 
primers, such that the sequence the probe is complementary to is amplified during the PCR reaction. 
Primer JKK16 (SEQ ID NO: 19) exemplifies such a sequence, and an appropriate probe could 
comprise this sequence. 

It may also be advantageous to provide in the kit one or more control sequences for use in the
PCR reactions. Appropriate positive control sequences may be essentially as those discussed above. 

EXAMPLES 

Example 1: Isolation of multiple human-P. carinii 
MSG sequences. 

A. Polymerase Chain Reaction (PCR) 
Amplification Cloning 

DNA was isolated from an autopsy lung sample of an HIV-infected patient with P. carinii 
pneumonia according to standard methods, using SDS and proteinase K (0.5 µg/ml), followed by
phenol-chloroform extraction and ethanol precipitation (Davis et al. (1986) Basic Methods in
Molecular Biology, Elsevier, NY). A genomic library using the same DNA cloned into the Xho I site
of lambda GEM 12 vector (Promega, Madison, WI) was commercially prepared (Lofstrand Labs 
Limited, Gaithersburg, MD). 

Primers to amplify full-length human P. carinii genes were designed 
based on published data (Garbe and Stringer (1994) Infect Immun. 62(8):3092-3101). The sense 
primer, JK151 (5'-TTT CAT ATG GCG CGG GCG GTC AAG CGG CAG-3') (SEQ ID NO: 21)
corresponds to nucleotides 153 to 175 of a published MSG sequence (GenBank accession number
L27092), and the antisense primer JK152 (5'-CTA AAT CAT GAA CGA AAT AAC CAT TGC
TAC-3') (SEQ ID NO: 22) is complementary to nucleotides 3215 to 3244 of the same sequence. An
Nde I site was created at the beginning of JK151, which substitutes a methionine for the valine of the
original sequence, to facilitate subcloning and expression. For amplification, 1 ng of genomic DNA
was added to a 50 µl reaction containing primers (25 pM each), dNTPs (0.2 mM), 5 U of AmpliTaq
(Perkin-Elmer), and MgCl2 (2.5 mM). The DNA amplification was performed on a Perkin Elmer
Cetus DNA thermal cycler. An initial denaturation cycle (1 minute at 96°C) was followed by 36 
cycles of denaturation at 95°C for 1 minute, annealing at 50°C for 2 minutes and extension at 72°C for 
2 minutes, followed by a final extension after the last cycle at 72°C for 10 minutes. 
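
The primer features described above can be verified directly from the primer sequences. The Python sketch below takes the JK151 and JK152 sequences as printed in this example (with spacing removed), locates the Nde I recognition site (CATATG) in JK151, and derives the sense-strand sequence to which JK152 is complementary. The helper name is hypothetical and the sketch is illustrative only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


JK151 = "TTTCATATGGCGCGGGCGGTCAAGCGGCAG"  # sense primer (SEQ ID NO: 21)
JK152 = "CTAAATCATGAACGAAATAACCATTGCTAC"  # antisense primer (SEQ ID NO: 22)
NDE_I = "CATATG"                          # Nde I recognition sequence

# Position of the Nde I site within JK151 (0-based offset from the 5' end).
nde_offset = JK151.find(NDE_I)

# The ATG inside CATATG supplies the methionine codon mentioned in the text.
start_codon = JK151[nde_offset + 3:nde_offset + 6]

# Sense-strand sequence covered by JK152; it ends in the stop codon TAG.
sense_target = reverse_complement(JK152)
```

Because the final three bases of the sense-strand target read TAG, a product amplified with JK152 retains the stop codon at the end of the coding region.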

A band of the correct size (approximately 3.1 Kb) was amplified and subjected to 
electrophoresis in 1% agarose gel in 1X TBE buffer. PCR products were then directly subcloned into
pCR II (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. Five clones that
differed in their restriction mapping and hybridization patterns were identified and sequenced
(HMSG11 (SEQ ID NO: 5) GenBank accession number AF033208; HMSG14 (SEQ ID NO: 7)
number AF033209; HMSG33 (SEQ ID NO: 11) number AF033210; HMSG35 (SEQ ID NO: 13)
number AF033211; and HMSG32 (SEQ ID NO: 9) number AF033212).

Nucleotide sequencing was performed using an automated sequencer (Model 373 or 377, 
Applied Biosystems/Perkin Elmer, Foster City, CA). The nucleotide sequence and deduced amino 
acid sequence data were analyzed by Factura and AutoAssembler (both from Applied Biosystems), 
Sequencher (Gene Codes Corp., Ann Arbor, MI), MacVector (Scientific Imaging Systems, New 
Haven, CT), ClustalW (40), and GeneWorks (IntelliGenetics, Mountain View, CA). 

All clones encoded MSG variants that were clearly related but differed from each other. The 
coding region of the clones varied in length from 3,054 to 3,087 bases, encoding proteins of 1,008 to 
1,028 amino acids with predicted molecular weights of 114 to 117 kDa. They are 74 to 91%
identical at the nucleotide level and 63 to 88% identical at the amino acid level when comparing pairs 
of clones. Overall, approximately 50% of the amino acids are conserved in all five clones. The 
clones are more closely related to each other than to rat P. carinii MSG genes. There is an 
approximately 60% identity at the DNA level and 40% identity at the amino acid level when 
comparing a human P. carinii MSG to rat P. carinii MSGGP3.

B. Southern hybridization/Library
screening 

For Southern hybridization with a radioactive probe, DNA was treated with restriction
enzymes, separated by agarose gel electrophoresis and transferred to Hybond N+ membranes
(Amersham, Life Science, Arlington Heights, IL) with 0.4 M NaOH. DNA was probed using an
approximately 600 bp Xba I fragment of the human P. carinii MSG III gene (Garbe and Stringer
(1994) Infect. Immun. 62:3092-3101) that had been labeled with α-32P dATP or α-32P dCTP by a
random priming kit (Boehringer Mannheim). Filters were prehybridized for 4 hours and then 
hybridized overnight at 55°C in 6X SSPE with 0.5% SDS, and 5X Denhardt's solution. Blots were 
washed in 6X SSPE with 0.5% SDS at room temperature for 10 minutes and then in 0.5X SSPE with 
0.5% SDS at 55°C twice for 30 minutes each. The genomic library was screened using a gel-purified 
full-length fragment of HMSG11 under the same conditions as above. One clone that hybridized
strongly to the probe was subcloned into the Bam HI site of pBluescript II (Stratagene, La Jolla, CA). 
This 12,792 bp clone (GenBank accession number AF038556) contained three full-length and one 
partial MSG sequences in a head-to-tail tandem arrangement, similar to what has previously been
reported (Garbe and Stringer (1994) Infect Immun. 62:3092-3101; Stringer et al. (1993) J. Eukaryot
Microbiol 40:821-826). One of the full-length MSG sequences did not have a complete open reading
frame due to a frame shift between bases 6290 and 6347. The codon corresponding to a methionine 
at the beginning of rat P. carinii MSG clones encoded a valine in all the open reading frames, 
consistent with earlier observations (Garbe and Stringer (1994) Infect Immun. 62:3092-3101; 
Stringer et al. (1993) J. Eukaryot Microbiol 40:821-826). Nucleotide sequencing was performed as 
above. 

Example 2: Characterization of Human-P. carinii
MSG Proteins 

Figure 1 shows an alignment of the predicted proteins encoded by the full length MSG genes 
cloned by PCR (MSG11, 14, 32, 33, and 35) and Southern (MSGp1 and p3), together with a previously
published human (Garbe and Stringer (1994) Infect Immun. 62:3092-3101) and rat P. carinii MSG
sequence (GenBank accession number L05906). Among the human-P. carinii MSG sequences, there
is substantial variability downstream of the amino-terminus, while the region near the carboxyl 
terminus is highly conserved. For example, there is 63% identity in the last 100 amino acids among 
all the genes (excluding the region encoded by the PCR primer JK152), which is about five times as 
high as the conservation among the first 100 amino acids (13% excluding the primer region 
corresponding to primer JK151). Like most known genes of P. carinii, all human P. carinii MSG 
genes show a strong AT bias, especially in the third position (approximately 70% A or T) (Edman et
al. (1989) Proc. Natl. Acad. Sci. USA 86:8625-8629; Garbe and Stringer (1994) Infect Immun.
62:3092-3101; Kovacs et al. (1993) J. Biol. Chem. 268:6034-6040; Wada et al. (1993) J. Infect. Dis.
168:979-985). As in other MSG molecules, cysteine residues of the human P. carinii MSG
molecules are relatively numerous (5.7 to 5.9%) and are highly conserved: 96% of all the cysteine
residues present in the human-P. carinii MSG clones are conserved in all the clones. When
comparing HuMSG11 to rat P. carinii MSG clone GP3, 94% of cysteine residues are conserved. The
cysteine residues are unevenly distributed in four main regions and often show a pattern of two
cysteines separated by 6 to 7 amino acids, similar to what is seen in rat P. carinii (Kovacs et al.
(1993) J. Biol. Chem. 268:6034-6040). There is no predictable pattern to the intervening amino
acids. All human MSG proteins share a highly conserved amino acid domain rich in threonine and 
serine residues near the carboxyl terminus. Seven to thirteen potential N-linked glycosylation sites 
(NXS/T) were observed in the MSGs. A premature stop codon was seen in MSG 32 after residue 
1008 which is most probably due to a PCR artifact resulting in a point mutation; studies using the 
ligase chain reaction with primers specific for the mutation supported this conclusion. 
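
Two of the sequence features discussed above, the number of potential N-linked glycosylation sites (NXS/T) and the cysteine content, can be tallied from a predicted protein sequence. The Python sketch below is illustrative only: the function names and the toy peptide are hypothetical, and the optional exclusion of proline at the X position is a commonly applied refinement that is an assumption here, not something stated in this specification.

```python
import re
from typing import List


def n_glycosylation_sites(protein: str, exclude_proline: bool = False) -> List[int]:
    """Return 1-based positions of Asn residues in N-X-S/T motifs.

    A lookahead is used so that overlapping motifs are all reported.
    """
    middle = "[^P]" if exclude_proline else "[A-Z]"
    pattern = re.compile("(?=N" + middle + "[ST])")
    return [match.start() + 1 for match in pattern.finditer(protein)]


def cysteine_percent(protein: str) -> float:
    """Percentage of residues that are cysteine."""
    return 100.0 * protein.count("C") / len(protein)


# Toy peptide (hypothetical, not one of the disclosed MSG sequences).
toy_peptide = "MNCSACDEFGHNKTCLLNPSWC"

sites_as_written = n_glycosylation_sites(toy_peptide)
sites_no_proline = n_glycosylation_sites(toy_peptide, exclude_proline=True)
cys_content = cysteine_percent(toy_peptide)
```

Applied to a full-length predicted MSG protein, the same two functions would give the site count and cysteine percentage of the kind reported in this example.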

A. Construction and expression of full 
length recombinant human P. carinii
MSG 

The full-length HMSG32 gene, which contains the premature stop codon, was inserted into 
pBlueBacHis2A (Invitrogen, Carlsbad, CA) at the Eco RI site for expression in a baculovirus insect
cell system. Correct insertion was confirmed by restriction mapping and sequencing. Isolation of 
recombinant virus, plaque purification and amplification of high titer virus stock were performed 
according to the manufacturer's protocols (Invitrogen, Carlsbad, CA). PCR amplification using gene- 
specific primers was used to confirm the presence of the gene in the virus. Sf9 cells were grown at 
27°C in SFII-900 medium (GIBCO BRL, Grand Island, NY) with 5% fetal calf serum to a density of
2.0 x 10^6 cells/ml. Cells were infected at a multiplicity of infection (moi) of 5. Seventy-two hours
after infection, cells were harvested by centrifugation, washed with phosphate buffered saline
supplemented with PMSF (1 mM/ml), then resuspended in 10 mM Tris-HCl, pH 8 with 1 mM PMSF,
and sonicated. The cell lysates were analyzed by SDS-PAGE and western blotting.

SDS-PAGE and western blotting were performed using standard techniques (see Kovacs et
al. (1988) J. Immunol. 140:2023-2031). Electrophoresis was done in pre-poured discontinuous 8%
and 14% acrylamide tris-glycine gels (Novex, San Diego, CA). Proteins were stained by Coomassie
blue or transferred to nitrocellulose membranes, following which western blots were performed with
a variety of antisera using standard techniques (Kovacs et al. (1988) J. Immunol. 140:2023-2031).
Recombinant rat P. carinii HMSGp3 protein (expressed in a baculovirus system) (Mei et al. (1996) J.
Eukaryot. Microbiol. 43:31S) and purified recombinant β-galactosidase (expressed in the pET 28-E.
coli system) were used as controls in western blotting.

Anti-peptide antisera were commercially generated in rabbits to a peptide specific for 
HMSG32 (KMYGLFYGSGKEWFKKLLEKIM (SEQ ID NO: 25), corresponding to amino acids 
461-482) and to a conserved human-P. carinii MSG epitope contained within the recombinant 
carboxyl terminal fragment (TITSTITSKITLTST (SEQ ID NO:26) corresponding to amino acids 968 
to 982 of MSG32) by the multiple antigenic peptide system method (Posnett et al. (1988) J. Biol.
Chem. 263:1719-1725) (Research Genetics, Huntsville, AL). Anti-Xpress monoclonal antibody,
which detects an epitope tag at the amino terminus of the fusion proteins expressed in
pBlueBacHis2A, was purchased from Invitrogen (Carlsbad, CA). T7-tag monoclonal antibody,
which detects an epitope tag at the amino terminus of the fusion proteins derived from pET 28A, was
purchased from Novagen, Inc. (Madison, WI). 

A time course showed that maximal expression occurred after 60-72 hours of infection. The 
identity of the recombinant protein was confirmed by western blotting using both an antibody against 
a peptide tag present in the vector as well as an anti-peptide antibody raised against a peptide (SEQ 
ID NO: 25) specific for MSG32. No reactivity was seen when Sf9 cells alone or recombinant
baculovirus-derived rat MSG GP3 were used as the targets. Multiple bands were seen in the western 
blots, especially when using the MSG-specific anti-peptide antibody. These likely represent protein 
degradation products, or possibly modification of the recombinant protein. 

Although rat MSGGP3 could be produced at a high level in a baculovirus system, and was 
easily purified by affinity chromatography using a nickel column (Mei et al. (1996) J. Eukaryot.
Microbiol. 43:31S), prolonged attempts to produce and purify high levels of human P. carinii MSG
were unsuccessful. 

B. Construction and Expression of the 
Conserved C-terminal Portion of 
Human-P. carinii MSGs

PCR was used to amplify the conserved carboxy-terminal region of the human P. carinii 
MSG gene without the carboxyl terminus hydrophobic tail, since this hydrophobic tail could 
potentially interfere with expression and purification. Primers were designed based on the alignment 
of five new MSG genes as well as the published sequence. The sense primer was JK451 (5'-GAA
TTC GAT CTG AAG CCT CTG GAG-3') (SEQ ID NO: 23), and the antisense primer was JK452 
(5'-TTC TAG AAA CCC ACT CAT CTT CAA-3') (SEQ ID NO: 24). An Eco RI site was added to 
the sense primer and an Xba I site, which encoded an in frame stop codon, was added to the antisense 
primer to facilitate subcloning. One µg of plasmid DNA was used for PCR amplification under the
same conditions used above for isolation of PCR clones. 

The 306 bp PCR product of carboxy-terminal region amplified from MSG33 was ligated in 
frame into pET28A (Novagen, Inc. Madison, WI) at the Eco RI site. pET28A is an expression vector 
in which a histidine tag precedes the insertion site. The presence of a six histidine (hexa-his) 
sequence in the expressed portion of the vector preceding the insert allows rapid, one-step 
purification of the recombinant protein by binding to nickel metal affinity chromatography matrix. 
Restriction mapping and sequencing were performed to confirm correct insertion. Expression was 
induced in E. coli strain BL21 (DE3) using 1 mM IPTG. Recombinant protein was solubilized with 
6M urea and purified by affinity chromatography using a nickel column according to the 
manufacturer's instructions (Novagen, Inc., Madison, WI). The sample was eluted with elution 
buffer without urea, dialyzed using 0.5X PBS to eliminate imidazole, and lyophilized for storage.

Recombinant protein was analyzed by SDS-PAGE and western blotting as above. High
level expression was observed within two hours; no equivalent band was seen using pET 28A without
insert under the same conditions. Although the yield was variable from experiment to experiment,
typically about 7 milligrams of purified protein was obtained from a one liter culture of E. coli. The
identity of the protein was confirmed by immunoblotting using both T7-tag monoclonal antibody and 
a polyclonal anti-epitope antibody generated in rabbits against an epitope (SEQ ID NO: 26) contained 
within the recombinant carboxyl terminal fragment. No reactivity was seen with preimmune rabbit 
serum, with uninduced E. coli extracts, or with second antibody alone. 

C. Evaluation of Human Sera Using
Antibodies to Human-P. carinii MSG

Human sera evaluated by immunoblotting included sera from both AIDS and non-AIDS 
patients with and without a history of P. carinii pneumonia, as well as healthy individuals. Samples 
included those from 11 immunosuppressed patients with recent or acute P. carinii pneumonia but
without HIV infection, 5 patients with HIV infection and P. carinii pneumonia, 17 patients with HIV 
infection but without P. carinii pneumonia, 3 patients with neither HIV infection nor P. carinii 
pneumonia, and 13 healthy laboratory workers. Human sera were tested at a dilution of 1:100. 
Horseradish peroxidase-conjugated goat anti-human IgG, alkaline phosphatase conjugated goat anti- 
rabbit IgG and goat anti-mouse IgG (all from GIBCO BRL) or horseradish peroxidase conjugated 
goat anti-cat, anti-rat, and anti-mouse IgG (Jackson ImmunoResearch Laboratories, Inc., West Grove, 
PA) were used as second antibodies in western blotting. 

All 49 samples reacted by immunoblotting with the recombinant peptide. Because the 
recombinant peptide included a vector-derived region, a subset of 4 samples was simultaneously
evaluated for reactivity with recombinant β-galactosidase expressed in the same vector. None of the
samples reacted with the recombinant β-galactosidase, demonstrating that the reactivity seen was
against the P. carinii derived peptide region. In addition, little or no reactivity was seen when using 
rat, mouse, or cat serum. 

Example 3: Detection of Human-P. carinii
Nucleic Acid Sequences. 

A. Preparation of a Vector Comprising A 
Control Sequence 

A mimic amplification construct containing a positive control sequence was prepared using 
the tetracycline resistance (tetR) gene coding sequence from pBR322 (Backman and Boyer (1983)
Gene 26:197). In order to generate a tetR gene-based amplicon that could be amplified using
MSG-specific primers JKK14/15 and JKK17, bipartite primers were generated with two distinct annealing
regions. The 5' region of each primer was taken from the MSG target sequences (e.g., SEQ ID NOS:
17 and 20). The 3' region of each primer was designed to be specific to the tetR coding sequence.
Amplification using these primers generated an amplicon containing an approximately 280 base
internal fragment of tetR coding sequence, with 25 nucleotide MSG-specific ends. For amplification,
1 µg of tetR coding sequence DNA was added to a 50 µl reaction containing primers (25 pM each),
dNTPs (0.2 mM), 5 U of AmpliTaq (Perkin-Elmer), and MgCl2 (2.5 mM). The DNA amplification
was performed on a Perkin Elmer Cetus DNA thermal cycler. An initial denaturation cycle (2
minutes at 94°C) was followed by 34 cycles of denaturation at 94°C for 1 minute, annealing at 68°C 
for 1 minute and extension at 72°C for 2 minutes, followed by a final extension after the last cycle at 
72°C for 5 minutes. 
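
The bipartite primer design described above can be illustrated schematically. In the Python sketch below, each primer is the concatenation of a 5' region copied from a target-specific primer and a 3' region that anneals to the control template; the simulated product therefore carries target-primer sequences at both ends and can be re-amplified with the target primers alone. All sequences shown are short hypothetical placeholders, not the JKK14, JKK15, JKK17, or tetR sequences, and the function names are assumptions made for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def bipartite_primer(target_part: str, control_part: str) -> str:
    """5' region from the target primer, 3' region specific to the control."""
    return target_part + control_part


def mimic_amplicon(template: str, forward: str, reverse: str,
                   control_len: int) -> str:
    """Top strand of the product once both bipartite primers are incorporated.

    Only the 3' control_len bases of each primer need to anneal to the template.
    """
    fwd_site = forward[-control_len:]
    rev_site = reverse_complement(reverse[-control_len:])
    start = template.find(fwd_site)
    end = template.find(rev_site, start + control_len) if start >= 0 else -1
    if start < 0 or end < 0:
        raise ValueError("primer 3' regions do not flank a template region")
    core = template[start:end + control_len]
    return forward[:-control_len] + core + reverse_complement(reverse[:-control_len])


# Hypothetical placeholder sequences (10-mers for brevity).
TARGET_FWD = "AAAAACCCCC"   # stands in for an upstream target primer
TARGET_REV = "GGTTGGTTGG"   # stands in for the downstream target primer
template = "GGATCC" + "ACGTACGTAC" + "TTTTGGGGCCCCAAAA" + "CATGCATGCA" + "GAATTC"

forward = bipartite_primer(TARGET_FWD, "ACGTACGTAC")
reverse = bipartite_primer(TARGET_REV, "TGCATGCATG")
amplicon = mimic_amplicon(template, forward, reverse, control_len=10)
```

Because the simulated amplicon begins with the upstream target primer sequence and ends with the reverse complement of the downstream one, it behaves as a mimic template for the target-specific primer pair while carrying an unrelated internal sequence.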

The resultant 294 base pair amplicon was ligated into the pCR 2.1 vector and transformed
into E. coli following the manufacturer's procedures (TA cloning Kit, Invitrogen, Carlsbad, CA).
Confirmation of the insert was performed through standard cloning and PCR techniques. 

B. Collection and Preparation of Clinical 
Samples 


Clinical samples for use in MSG-PCR detection of P. carinii can be collected in any 
conventional way. Sputum was collected as described in Bigby et al. (Am. Rev. Respir. Dis. 133:515-518,
1986), and Kovacs et al. (NEJM 318:589-593, 1988). Bronchoalveolar lavage (BAL) was
performed as described in Ognibene et al. (Am. Rev. Respir. Dis. 129:929-932, 1984). Oral washes
were carried out by having the subject gargle with 50 cc of normal saline for 10-30 seconds and then
expectorate the wash into a sample cup (Helweg-Larsen et al. (1998) J. Clin. Microbiol. 36:2068-2072).
Serum samples were obtained from blood in a conventional fashion. A 200 µL aliquot of
serum was used for DNA extraction. 

Oral washes, sputum and bronchoalveolar lavages were spun down at 3500 rpm for 10 minutes
and the supernatant decanted, leaving approximately 1 ml of liquid in which to resuspend the pellet.
Samples were transferred to 2 ml microfuge tubes and centrifuged at 10,000 rpm for 10 minutes to
remove remaining liquid. A 250 µL aliquot of InstaGene Matrix (BioRad, Cat. #732-6030, Hercules,
CA) was added to the pellet and vortexed briefly. The samples were then incubated at 56° C for 20
minutes, vortexed for 10 seconds and incubated at 100° C for 8 minutes. The samples were vortexed
again for 10 seconds and centrifuged at 12,000 rpm for 3 minutes; 5 µL of the resultant supernatant
was used in each standard 50 µL PCR reaction.

In certain experiments, DNA was extracted from samples prepared as above using the 
NucliSens Isolation System (Organon Teknika Corp., Netherlands), using the manufacturer's 
instructions. 


C. Conditions for PCR reactions 

To minimize contamination, DNA extraction, amplification and product detection 
procedures were carried out in separate areas of the laboratory, aerosol-barrier pipette tips were used
for all reagent transfers, and multiple negative controls were included in each experiment. In order to
minimize carry-over contamination from amplified samples, all specimens were irradiated with UV
light after completion of amplification to cross-link the IP-10, which reacts with the PCR product to
make it unamplifiable while not interfering with detection (Isaacs et al. (1991) Nucleic Acids Res.
19:109-116; Rys and Persing (1993) J. Clin. Microbiol. 31:2356-2360). 

MSG sequence: For PCR amplification of human-P. carinii MSG in clinical samples, the 
upstream primer used was an equimolar mixture of JKK14 (SEQ ID NO: 17) (corresponding to the 
residues 2887-2911 of HMSG33, which is also 2845-2869 of HMSG11) and JKK15 (SEQ ID NO:
18) (corresponding to residues 2836-2860 of HMSG32). The downstream primer used was
JKK17 (SEQ ID NO: 20) (complementary to the conserved residues 3106-3130 of HMSG33, which
is also 3064-3088 of HMSG11). In experiments wherein the amplified product was detected using the
DELFIA™ system, the downstream primer was biotinylated at the 5' end to allow specific capture of 
amplified sequences through the use of streptavidin. 

PCR amplification was carried out in standard PCR reaction mixture (50 mM KCl, 10 mM
Tris, pH 8.0, 0.01% gelatin, 3 mM MgCl2, 400 µM dNTPs (Boehringer Mannheim), 1 µM each
oligonucleotide primer, and 0.025 units/µl of AmpliTaq (Perkin Elmer Cetus)). The HRI AmpStop™
system was used to control carry-over contamination; IP-10 (a psoralen derivative) (4 µg/µl) was
added to each reaction to enable UV cross-linking at the end of the amplification cycle, thereby
reducing the possibility of cross-contamination of other samples by amplified products (HRI
Research, Inc., Concord, CA). 
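
The final concentrations listed above translate into absolute amounts per reaction once a reaction volume is fixed. The Python sketch below assumes the 50 µL reaction volume mentioned in the sample preparation section and takes the dNTP figure as stated (the text does not say whether it applies to each nucleotide or to the total); the function names are hypothetical.

```python
REACTION_VOLUME_UL = 50.0  # assumed: the "standard 50 uL PCR reaction"


def amount_pmol(conc_um: float, volume_ul: float) -> float:
    """1 uM equals 1 pmol per uL, so pmol = uM x uL."""
    return conc_um * volume_ul


def amount_nmol(conc_mm: float, volume_ul: float) -> float:
    """1 mM equals 1 nmol per uL, so nmol = mM x uL."""
    return conc_mm * volume_ul


def enzyme_units(units_per_ul: float, volume_ul: float) -> float:
    return units_per_ul * volume_ul


primer_pmol = amount_pmol(1.0, REACTION_VOLUME_UL)           # each primer
dntp_nmol = amount_pmol(400.0, REACTION_VOLUME_UL) / 1000.0  # dNTPs as stated
mgcl2_nmol = amount_nmol(3.0, REACTION_VOLUME_UL)            # magnesium chloride
polymerase_units = enzyme_units(0.025, REACTION_VOLUME_UL)   # polymerase
```

Under these assumptions each reaction contains 50 pmol of each primer, 20 nmol of dNTPs, 150 nmol of MgCl2, and 1.25 units of polymerase.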

Samples were amplified using one of the following two PCR cycles: (1) an initial 
denaturation cycle (5 minutes at 94° C) was followed by 44 cycles of denaturation at 94° C for 30 
seconds, annealing at 65° C for 1 minute and extension at 72° C for 2 minutes, followed by a final 
extension after the last cycle at 72° C for 5 minutes; (2) an initial denaturation at 96° C for 1 minute 
was followed by 43 cycles of denaturation at 95° C for 1 minute, annealing at 65° C for 1 minute, and 
extension at 72° C for 1 minute, with a final extension time of 10 minutes at 72° C. All specimens 
were irradiated with UV light after completion of cycling to cross-link the incorporated IP-10. 
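
For planning purposes, the two cycling programs above can be written down as data and their programmed hold times compared. The Python sketch below is illustrative only; the class name is hypothetical, and the totals count hold times alone, since ramp times between temperatures depend on the instrument and are not given in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CyclingProgram:
    initial_denaturation_min: float
    cycles: int
    # Hold times per cycle: denaturation, annealing, extension (minutes).
    step_minutes: Tuple[float, float, float]
    final_extension_min: float

    def hold_minutes(self) -> float:
        """Total programmed hold time, excluding temperature ramps."""
        return (self.initial_denaturation_min
                + self.cycles * sum(self.step_minutes)
                + self.final_extension_min)


# Program (1): 5 min at 94 C; 44 cycles of 30 s / 1 min / 2 min; 5 min final.
program_1 = CyclingProgram(5.0, 44, (0.5, 1.0, 2.0), 5.0)
# Program (2): 1 min at 96 C; 43 cycles of 1 min / 1 min / 1 min; 10 min final.
program_2 = CyclingProgram(1.0, 43, (1.0, 1.0, 1.0), 10.0)

total_1 = program_1.hold_minutes()  # 164.0 minutes
total_2 = program_2.hold_minutes()  # 140.0 minutes
```

On hold times alone, program (1) runs for 164 minutes and program (2) for 140 minutes.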

Mitochondrial large subunit rRNA (MRSU): Previously published PCR primers pAZ102-E
and pAZ102-H were used to amplify P. carinii mitochondrial large subunit rRNA (MRSU) in
clinical samples (Wakefield et al. (1990) Mol. and Biochem. Parasitol. 43:69-76). Primer pAZ102-H
was biotinylated at the 5' end to allow streptavidin-mediated capture of the amplified product in 
experiments wherein the amplified product was detected using the DELFIA™ system. The PCR 
reaction mixture employed was as above. Samples were amplified using one of the following two 
PCR cycles: (1) an initial denaturation cycle (2 minutes at 94° C) was followed by 40 cycles of 
denaturation at 94° C for 1.5 minutes, annealing at 55° C for 1.5 minutes and extension at 72° C for 2
minutes, followed by a final extension after the last cycle at 72° C for 5 minutes; (2) an initial 
denaturation at 96° C for 1 minute was followed by 43 cycles of denaturation at 95° C for 1 minute, 
annealing at 65° C for 1 minute, and extension at 72° C for 1 minute, with a final extension time of 10 
minutes at 72° C.

Detection of Amplified PCR Products

Southern Blotting: Standard Southern blotting techniques were used to confirm the PCR
results (Tables 2 and 3). Following agarose gel electrophoresis, PCR products were transferred to
Hybond N+ membranes (Amersham, Life Science, Arlington Heights, IL). Amplification of
human-P. carinii MSG was detected using probe JKK16 (SEQ ID NO: 19), which corresponds to residues
3004-3029 of HMSG33. Amplification of P. carinii MRSU was detected using pAZ102-L2
(Wakefield et al. (1990) Mol. and Biochem. Parasitol. 43:69-76). Oligonucleotides were labeled
with [γ-32P]-ATP by T4 polynucleotide kinase (Ready-to-Go™ Molecular Biology Reagents,
Pharmacia Biotech, Denmark). Prehybridization and hybridization were performed overnight at 52°
C in 6 X SSPE, 1% sodium dodecyl sulfate (SDS), 10 X Denhardt's solution (Research Genetics,
Huntsville, Alabama). Filters were washed at 52° C in 1 X SSPE, 0.5% SDS for 30 minutes, then 0.1 X
SSPE, 0.5% SDS for 15 minutes.

Time-Resolved Fluorescence: Time-resolved fluorescence detection of amplified 
sequences was carried out using the DELFIA® system essentially as described by the manufacturer 
(EG&G Wallac Co.). Using standard procedures, amplicons with incorporated biotin were 
immobilized in streptavidin-coated microtiter plate wells and washed. Europium-labeled JKK16 was 
used to probe for the presence of amplified MSG sequences; europium-labeled pAZ102-L2 was used 
to probe for the presence of amplified rRNA sequences. Results are summarized in Tables 4 and 5, in 
comparison to DFA staining. 

F. Comparison of P. carinii Detection Methods 

Oral wash samples were collected along with sputum, induced sputum or BAL. All samples 
were evaluated by direct fluorescent antibody (DFA) staining. DFA staining was performed using a 
commercially available kit per the manufacturer's instructions (Genetics Systems, Seattle, WA). Oral 
wash samples were further tested by PCR, using both primer pairs as detailed above. Summarized 
results from multiple experiments are shown. Table 2 summarizes the results of a comparison 
between DFA staining and MSG and MRSU PCR amplification of BAL samples. Table 3 shows the 
results of a similar comparison using oral wash specimens. Table 4 shows the results of the 
comparison of samples taken via oral wash; results were determined using the Delfia™ hybridization 
capture system. Table 5 shows the results of the comparison of samples taken from serum; results 
were determined using the Delfia™ hybridization capture system. 

The DFA-/PCR+ samples (Table 4) likely represent true positive results based on PCR 
amplification of corresponding sputum samples or concordance between the two PCR methods. One 
patient with PCP diagnosed by BAL had a negative PCR of oral wash and sputum by both methods, 
and negative DFA of induced sputum. These data suggest that PCR performed on oral washes can be 
an accurate, non-invasive means of diagnosing PCP. 
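The agreement between PCR and DFA staining can be summarized as sensitivity and specificity computed from a 2 x 2 table. The sketch below is illustrative only: the class and method names are assumptions, the counts are the oral wash values of Table 4 read with DFA stain results as rows and PCR results as columns, and DFA is treated as the reference method even though the DFA-/PCR+ samples are interpreted above as likely true positives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Hypothetical 2 x 2 comparison of PCR against DFA staining (the reference)."""
    stain_pos_pcr_pos: int
    stain_pos_pcr_neg: int
    stain_neg_pcr_pos: int
    stain_neg_pcr_neg: int

    def sensitivity(self) -> float:
        # Fraction of DFA-positive specimens that were also PCR-positive.
        return self.stain_pos_pcr_pos / (self.stain_pos_pcr_pos + self.stain_pos_pcr_neg)

    def specificity(self) -> float:
        # Fraction of DFA-negative specimens that were also PCR-negative.
        return self.stain_neg_pcr_neg / (self.stain_neg_pcr_pos + self.stain_neg_pcr_neg)

    def total(self) -> int:
        return (self.stain_pos_pcr_pos + self.stain_pos_pcr_neg
                + self.stain_neg_pcr_pos + self.stain_neg_pcr_neg)


# Table 4 (oral wash, hybridization capture), under the row/column reading stated above.
MSG_TABLE_4 = TwoByTwo(11, 0, 4, 157)
MRSU_TABLE_4 = TwoByTwo(9, 2, 3, 158)

if __name__ == "__main__":
    for name, table in (("MSG", MSG_TABLE_4), ("MRSU", MRSU_TABLE_4)):
        print(name, round(table.sensitivity(), 3), round(table.specificity(), 3))
```

Because the reference method itself may miss true infections, the specificity computed this way is best read as a lower bound on agreement rather than as a definitive figure.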
Table 2: Results of DFA staining compared to MSG and MRSU gene primer PCR amplification in 
BAL specimens, as measured by Southern hybridization. 

                         No. of BAL specimens 
                 MSG gene primers       MRSU gene primers 
Stain Results    Positive   Negative    Positive   Negative 
Positive             7          0           6          1 
Negative             0         12           0 
Table 3: Results of DFA staining compared to MSG and MRSU gene primer PCR amplification in 
oral wash specimens, as measured by Southern hybridization. 

                      No. of oral wash specimens 
                 MSG gene primers       MRSU gene primers 
Stain Results    Positive   Negative    Positive   Negative 
Positive             4          4           3          5 
Negative             3         70           0         73 



Table 4: Results of DFA staining compared to MSG and MRSU gene primer PCR amplification in 
oral wash specimens, as measured by Delfia™ hybridization capture assay. 

                      No. of oral wash specimens 
                 MSG gene primers       MRSU gene primers 
Stain Results    Positive   Negative    Positive   Negative 
Positive            11          0           9          2 
Negative             4        157           3        158 






Table 5: Results of DFA staining compared to MSG and MRSU gene primer PCR amplification in 
blood serum specimens, as measured by Delfia™ hybridization capture assay. 

                        No. of serum specimens 
                 MSG gene primers       MRSU gene primers 
Stain Results    Positive   Negative    Positive   Negative 
Positive             3          0           2          1 
Negative             0          7           0          7 

G. Sensitivity of PCR Using Human-P. carinii MSG 



The sensitivity of the PCR assay was tested quantitatively by serial dilution of DNA isolated 
from an autopsy lung sample of an HIV-infected patient with P. carinii pneumonia (as above). From 
this DNA preparation, amplified PCR product could be generated with the MSG gene primers 
(JKK14, JKK15 and JKK17) using as little as about 16 fg of genomic DNA containing human-P. 
carinii DNA as the template. This amount indicates that MSG gene amplification is about 10- to 
100-fold more sensitive than amplification using the large subunit rRNA gene primers (pAZ102-E and 
pAZ102-H). This calculation is based on total DNA, the vast majority of which is human DNA rather 
than P. carinii DNA, since there is no good method for separating human-P. carinii DNA from the 
human DNA in a single sample. Amounts of DNA were measured by spectrophotometry. 
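The stated 10- to 100-fold difference can be turned into an implied detection limit for the rRNA gene primers. The short sketch below only restates that arithmetic; the function and constant names are assumptions, and the resulting range is inferred from the fold difference rather than reported as a measured value.

```python
from typing import Tuple

# Values taken from the text: about 16 fg total template DNA for the MSG primers,
# and MSG amplification about 10- to 100-fold more sensitive than rRNA amplification.
MSG_LIMIT_FG = 16.0
FOLD_RANGE = (10.0, 100.0)


def implied_limit_range_fg(limit_fg: float, fold_range: Tuple[float, float]) -> Tuple[float, float]:
    """Return the (low, high) template amount implied for the less sensitive primer set."""
    low_fold, high_fold = fold_range
    return (limit_fg * low_fold, limit_fg * high_fold)


if __name__ == "__main__":
    low, high = implied_limit_range_fg(MSG_LIMIT_FG, FOLD_RANGE)
    print(f"Implied rRNA primer limit: {low:.0f} fg to {high:.0f} fg of total DNA")
```

Under these assumptions the rRNA gene primers would require roughly 160 fg to 1.6 pg of the same total DNA preparation.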

The foregoing examples are provided by way of illustration only. One of skill in the art will 
appreciate that numerous variations on the biological molecules and methods described above may be 
employed to make and use oligonucleotide primers for the amplification of human-P. carinii MSG- 
encoding sequences, and for their use in detection and diagnosis of P. carinii in clinical samples. We 
claim all such subject matter that falls within the scope and spirit of the following claims. 
CLAIMS 



We claim: 

1. A method of detecting the presence of Pneumocystis carinii in a biological 
specimen, comprising: 

amplifying a human-P. carinii nucleic acid sequence, if such sequence is present in 
the sample, using two or more oligonucleotide primers derived from human-P. carinii MSG protein 
encoding sequence; and 

determining whether an amplified sequence is present. 

2. The method according to claim 1, wherein amplification of the human-P. carinii 
nucleic acid sequence is by polymerase chain reaction. 

3. The method of claim 1, wherein the human-P. carinii nucleic acid sequence is a 
highly conserved region within an MSG-protein encoding sequence. 

4. The method of claim 3, wherein the highly conserved region comprises a sequence 
selected from the group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 
of HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 
(SEQ ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 

5. The method of claim 1, wherein at least one oligonucleotide primer comprises at 
least 15 contiguous nucleotides from a sequence chosen from the group consisting of: residues 
2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 
(SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 
2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of 
HMSGp2 (SEQ ID NO: 15) and nucleic acid sequences having at least 70% sequence homology with 
residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 
2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of HMSG32 
(SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ ID NO: 13), 
and 1-249 of HMSGp2 (SEQ ID NO: 15). 

6. The method of claim 5, wherein at least one oligonucleotide primer comprises at 
least 15 contiguous nucleotides from a nucleic acid sequence having at least 90% sequence homology 
with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 
2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of 
HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ 
ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 

7. The method of claim 5, wherein at least one oligonucleotide primer comprises at 
least 15 contiguous nucleotides from a nucleic acid sequence having at least 95% sequence homology 
with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 
2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of 
HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ 
ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 
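The two sequence criteria recited in claims 5 to 7 (a run of at least 15 contiguous nucleotides, and a minimum percentage of sequence homology) can be illustrated computationally. The following is a simplified sketch under stated assumptions: the function names are hypothetical, the sequences used are toy strings rather than any SEQ ID NO, and percent identity is computed over an ungapped, equal-length comparison, which is only one of several ways homology may be assessed.

```python
def shares_contiguous_run(primer: str, region: str, min_len: int = 15) -> bool:
    """Return True if `primer` contains at least `min_len` contiguous nucleotides of `region`."""
    primer, region = primer.upper(), region.upper()
    if len(primer) < min_len:
        return False
    # Slide a window of min_len over the primer and look for it in the region.
    return any(primer[i:i + min_len] in region
               for i in range(len(primer) - min_len + 1))


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity of two equal-length sequences (a simplifying assumption)."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    toy_region = "ACGT" * 10        # 40-nt toy sequence, not taken from the sequence listing
    toy_primer = toy_region[5:25]   # 20 contiguous nucleotides of the toy region
    print(shares_contiguous_run(toy_primer, toy_region))   # True
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))    # 90.0
```

In practice an alignment program would be used to handle gaps and unequal lengths; the sketch only shows how the thresholds are applied once an alignment is fixed.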

8. The method of claim 5, wherein the oligonucleotide primers are chosen from the 
group consisting of: SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID 
NO: 23, and SEQ ID NO: 24. 

9. The method of claim 5, wherein the pair of oligonucleotide primers consist of one 
upstream primer and one downstream primer. 

10. The method of claim 9, wherein: 

the upstream primer is chosen from the group consisting of: SEQ ID NO: 
17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO:23; and 

the downstream primer is chosen from the group consisting of: SEQ ID 
NO: 20 and SEQ ID NO: 24. 

11. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 17. 

12. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 18. 

13. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 19. 

14. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 20. 

15. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 23. 

16. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 24. 

17. The method of claim 1, wherein the biological specimen is from the oropharyngeal 
tract. 

18. The method of claim 1, wherein the biological specimen is from blood. 

19. The method of claim 1, wherein the step of determining whether an amplified 
sequence is present comprises one or more of: 

(a) electrophoresis and staining of the amplified sequence; or 

(b) hybridization to a labeled probe of the amplified sequence. 

20. The method of claim 19, wherein the amplified sequence is detected by 
hybridization to a labeled probe. 

21. The method of claim 22, wherein the probe comprises a detectable non-isotopic 
label chosen from the group consisting of: 

a fluorescent molecule; 

a chemiluminescent molecule; 

an enzyme; 

a co-factor; 

an enzyme substrate; and 

a hapten. 

22. The method of claim 21, wherein the labeled probe comprises a nucleic acid 
sequence according to SEQ ID NO: 19. 

23. A method of detecting the presence of Pneumocystis carinii in a biological 
specimen, comprising: 

exposing the biological specimen to a probe that hybridizes to a human-P. carinii 
nucleic acid sequence, if the sequence is present in the sample to form a hybridization complex; and 

determining whether the hybridization complex is present 

wherein the nucleic acid sequence derived from human-P. carinii is an MSG encoding 
sequence. 

24. The method of claim 23, wherein the labeled probe comprises a nucleic acid 
sequence according to SEQ ID NO: 19. 

25. A purified protein comprising an amino acid sequence selected from the group 
consisting of 

(a) SEQ ID NO: 2; 

(b) SEQ ID NO: 4; 

(c) SEQ ID NO: 6; 

(d) SEQ ID NO: 8; 

(e) SEQ ID NO: 10; 

(f) SEQ ID NO: 12; 

(g) SEQ ID NO: 14; 

and conservative substitutions thereof. 

26. An isolated nucleic acid molecule encoding a protein according to claim 25. 

27. The isolated nucleic acid molecule according to claim 26, wherein the nucleic acid 
molecule has a sequence selected from the group consisting of: SEQ ID NO: 1; SEQ ID NO: 2; SEQ 
ID NO: 3; SEQ ID NO: 4, SEQ ID NO: 5; SEQ ID NO: 6, SEQ ID NO: 7; SEQ ID NO: 15; and SEQ 
ID NO: 17. 

28. An isolated nucleic acid molecule comprising a sequence selected from the group 
consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID 
NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 
2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of 
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least 
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of 
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ 
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 
29. An isolated nucleic acid molecule comprising a sequence selected from the group 
consisting of: at least 15 contiguous nucleotides of the nucleic acid molecule according to claim 28. 

30. An isolated nucleic acid molecule comprising a sequence selected from the group 
consisting of: at least 20 contiguous nucleotides of the nucleic acid molecule according to claim 29. 

31. A recombinant vector comprising the nucleic acid molecule according to claim 28. 

32. A transgenic cell comprising the vector according to claim 31. 

33. A kit for detecting a human-P. carinii nucleic acid sequence comprising at least a 
pair of primers each comprising at least 15 contiguous nucleotides of sequence selected from the 
group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ 
ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 
2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of 
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least 
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of 
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ 
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 

34. A kit for detecting a human-P. carinii nucleic acid sequence comprising at least a 
pair of primers each comprising at least 20 contiguous nucleotides of sequence selected from the 
group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ 
ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 
2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of 
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least 
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of 
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ 
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 

35. A kit for detecting a human-P. carinii nucleic acid sequence comprising at least a 
pair of primers each comprising at least 30 contiguous nucleotides of sequence selected from the 
group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ 
ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 
2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of 
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least 
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of 
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ 
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 
36. The kit of claim 33, wherein at least one of the oligonucleotide primers comprises a 
sequence selected from the group consisting of: SEQ ID NO: 17; SEQ ID NO: 18; SEQ ID NO: 19; 
SEQ ID NO: 20; SEQ ID NO: 21; SEQ ID NO: 22; SEQ ID NO: 23; and SEQ ID NO: 24. 

37. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 17. 

38. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 18. 

39. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 19. 

40. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 21. 

41. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 22. 

42. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 23. 

43. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 24. 

44. Antibody raised against the peptide sequence according to SEQ ID NO: 25. 

45. Antibody raised against the peptide sequence according to SEQ ID NO: 26. 
AMENDED CLAIMS 

[received by the International Bureau on 16 March 2000 (16.03.00); 
original claims 1-45 replaced by amended claims 1-45 (5 pages)] 

1. A method of detecting the presence of Pneumocystis carinii in a biological 
specimen, comprising: 

amplifying a highly conserved region within a human-P. carinii nucleic acid 
sequence, if such sequence is present in the sample, using two or more oligonucleotide primers 
derived from human-P. carinii MSG protein encoding sequence; and 

determining whether an amplified sequence is present. 

2. The method according to claim 1, wherein amplification of the human-P. carinii 
nucleic acid sequence is by polymerase chain reaction. 

3. The method of claim 1, wherein the human-P. carinii nucleic acid sequence is a 
highly conserved region within an MSG-protein encoding sequence. 

4. The method of claim 3, wherein the highly conserved region comprises a sequence 
selected from the group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 
of HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 
(SEQ ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 

5. The method of claim 1, wherein at least one oligonucleotide primer comprises at 
least 15 contiguous nucleotides from a sequence chosen from the group consisting of: residues 
2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 
(SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 
2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of 
HMSGp2 (SEQ ID NO: 15) and nucleic acid sequences having at least 70% sequence homology with 
residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 
2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of HMSG32 
(SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ ID NO: 13), 
and 1-249 of HMSGp2 (SEQ ID NO: 15). 

6. The method of claim 5, wherein at least one oligonucleotide primer comprises at 
least 15 contiguous nucleotides from a nucleic acid sequence having at least 90% sequence homology 
with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 
2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of 
HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ 
ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 

7. The method of claim 5, wherein at least one oligonucleotide primer comprises at 
least 15 contiguous nucleotides from a nucleic acid sequence having at least 95% sequence homology 
with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID NO: 3), 
2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 2836-3081 of 
HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of HMSG35 (SEQ 
ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 

8. The method of claim 5, wherein the oligonucleotide primers are chosen from the 
group consisting of: SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID 
NO: 23, and SEQ ID NO: 24. 

9. The method of claim 5, wherein the pair of oligonucleotide primers consist of one 
upstream primer and one downstream primer. 

10. The method of claim 9, wherein: 

the upstream primer is chosen from the group consisting of: SEQ ID NO: 
17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 23; and 

the downstream primer is chosen from the group consisting of: SEQ ID 
NO: 20 and SEQ ID NO: 24. 

11. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 17. 

12. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 18. 

13. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 19. 

14. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 20. 

15. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 23. 

16. The method of claim 8, wherein one of the oligonucleotide primers comprises SEQ 
ID NO: 24. 

17. The method of claim 1, wherein the biological specimen is from the oropharyngeal 
tract. 

18. The method of claim 1, wherein the biological specimen is from blood. 

19. The method of claim 1, wherein the step of determining whether an amplified 
sequence is present comprises one or more of: 

(a) electrophoresis and staining of the amplified sequence; or 

(b) hybridization to a labeled probe of the amplified sequence. 

20. The method of claim 19, wherein the amplified sequence is detected by 
hybridization to a labeled probe. 

21. The method of claim 22, wherein the probe comprises a detectable non-isotopic 
label chosen from the group consisting of: 

a fluorescent molecule; 

a chemiluminescent molecule; 

an enzyme; 

a co-factor; 

an enzyme substrate; and 

a hapten. 

22. The method of claim 21, wherein the labeled probe comprises a nucleic acid 
sequence according to SEQ ID NO: 19. 

23. A method of detecting the presence of Pneumocystis carinii in a biological 
specimen, comprising: 

exposing the biological specimen to a probe that hybridizes to a highly conserved 
region within a human-P. carinii nucleic acid sequence, if the sequence is present in the sample to 
form a hybridization complex; and 

determining whether the hybridization complex is present 
wherein the nucleic acid sequence derived from human-P. carinii is an MSG encoding 
sequence. 

24. The method of claim 23, wherein the labeled probe comprises a nucleic acid 
sequence according to SEQ ID NO: 19. 

25. A purified protein comprising an amino acid sequence selected from the group 
consisting of 

(a) SEQ ID NO: 2; 

(b) SEQ ID NO: 4; 

(c) SEQ ID NO: 6; 

(d) SEQ ID NO: 8; 

(e) SEQ ID NO: 10; 
(f) SEQ ID NO: 12; 
(g) SEQ ID NO: 14; 

and conservative substitutions thereof. 

26. An isolated nucleic acid molecule encoding a protein according to claim 25. 

27. The isolated nucleic acid molecule according to claim 26, wherein the nucleic acid 
molecule has a sequence selected from the group consisting of: SEQ ID NO: 1; SEQ ID NO: 2; SEQ 
ID NO: 3; SEQ ID NO: 4, SEQ ID NO: 5; SEQ ID NO: 6, SEQ ID NO: 7; SEQ ID NO: 15; and SEQ 
ID NO: 17. 

28. An isolated nucleic acid molecule comprising a sequence selected from the group 
consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ ID 
NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 
2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of 
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least 
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of 
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ 
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 

29. An isolated nucleic acid molecule comprising a sequence selected from the group 
consisting of: at least 15 contiguous nucleotides of the nucleic acid molecule according to claim 28. 

30. An isolated nucleic acid molecule comprising a sequence selected from the group 
consisting of: at least 20 contiguous nucleotides of the nucleic acid molecule according to claim 29. 

31. A recombinant vector comprising the nucleic acid molecule according to claim 28. 

32. A transgenic cell comprising the vector according to claim 31. 

33. A kit for detecting a human-P. carinii nucleic acid sequence comprising at least a 
pair of primers each comprising at least 15 contiguous nucleotides of sequence selected from the 
group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ 
ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 
2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of 
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least 
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of 
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ 
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 

34. A kit for detecting a human-P. carinii nucleic acid sequence comprising at least a 
pair of primers each comprising at least 20 contiguous nucleotides of sequence selected from the 
group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ 
ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 
2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of 
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least 
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of 
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ 
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 

35. A kit for detecting a human-P. carinii nucleic acid sequence comprising at least a 
pair of primers each comprising at least 30 contiguous nucleotides of sequence selected from the 
group consisting of: residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of HMSGp3 (SEQ 
ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ ID NO: 7), 
2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 2821-3072 of 
HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15); and a sequence with at least 
70% sequence identity with residues 2894-3042 of HMSGp1 (SEQ ID NO: 1), 2758-3006 of 
HMSGp3 (SEQ ID NO: 3), 2845-3090 of HMSG11 (SEQ ID NO: 5), 2839-3084 of HMSG14 (SEQ 
ID NO: 7), 2836-3081 of HMSG32 (SEQ ID NO: 9), 2887-3132 of HMSG33 (SEQ ID NO: 11), 
2821-3072 of HMSG35 (SEQ ID NO: 13), and 1-249 of HMSGp2 (SEQ ID NO: 15). 
36. The kit of claim 33, wherein at least one of the oligonucleotide primers comprises a 
sequence selected from the group consisting of: SEQ ID NO: 17; SEQ ID NO: 18; SEQ ID NO: 19; 
SEQ ID NO: 20; SEQ ID NO: 21; SEQ ID NO: 22; SEQ ID NO: 23; and SEQ ID NO: 24. 

37. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 17. 

38. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 18. 

39. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 19. 

40. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 21. 

41. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 22. 

42. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 23. 

43. The kit of claim 36, wherein one of the oligonucleotide primers comprises the 
sequence according to SEQ ID NO: 24. 

44. Antibody raised against the peptide sequence according to SEQ ID NO: 25. 

45. Antibody raised against the peptide sequence according to SEQ ID NO: 26. 
SEQUENCE LISTING 

<110> Kovacs, et al. 

<120> Identification of a region of the major surface 
glycoprotein (MSG) gene of human Pneumocystis carinii 

<130> 53232 

<140> 
<141> 

<160> 26 

<170> PatentIn Ver. 2.0 

<210> 1 
<211> 3042 
<212> DNA 
<213> Pneumocystis carinii sp. f. hominis 

<220> 
<221> CDS 
<222> (1)..(3042) 

<400> 1 

gtg gcg cgg gcg gtt aag cgg cag gta aca gga gca tca gga gta gat 48 
Val Ala Arg Ala Val Lys Arg Gln Val Thr Gly Ala Ser Gly Val Asp 
1 5 10 15 

gag gag gaa gtg cgt ctt ttg gct tta ata cta aaa gaa gat tct aag 96 
Glu Glu Glu Val Arg Leu Leu Ala Leu Ile Leu Lys Glu Asp Ser Lys 
20 25 30 

gat gat aaa aaa tgc gaa gaa aaa tta gaa aaa cat tgc aaa gaa tta 144 
Asp Asp Lys Lys Cys Glu Glu Lys Leu Glu Lys His Cys Lys Glu Leu 
35 40 45 

agt gaa gca aat cta act cca gaa caa gta cat gaa aag tta aaa gat 192 
Ser Glu Ala Asn Leu Thr Pro Glu Gln Val His Glu Lys Leu Lys Asp 
50 55 60 

ttc tgt gat agc aaa aaa cgt gat aaa aaa tgt aaa gaa cta aaa aaa 240 
Phe Cys Asp Ser Lys Lys Arg Asp Lys Lys Cys Lys Glu Leu Lys Lys 
65 70 75 80 

aat gtt gaa aaa aaa tgc ggt gat ttt aaa aca gaa tta gaa gaa ttg 288 
Asn Val Glu Lys Lys Cys Gly Asp Phe Lys Thr Glu Leu Glu Glu Leu 
85 90 95 

gtg aaa aag gaa gct tca aat ttg aaa aat gat gag tgt aca aaa aat 336 
Val Lys Lys Glu Ala Ser Asn Leu Lys Asn Asp Glu Cys Thr Lys Asn 
100 105 110 

gaa caa cag tgc ttg ttt tta gaa gaa gca tgc tct gat ctt aca aag 384 
Glu Gln Gln Cys Leu Phe Leu Glu Glu Ala Cys Ser Asp Leu Thr Lys 
115 120 125 

aat tgc aac gat tta aga aac aaa tgt tat cag aat aag cgt gat aag 432 
Asn Cys Asn Asp Leu Arg Asn Lys Cys Tyr Gln Asn Lys Arg Asp Lys 
130 135 140 

1 



SUBSTITUTE SHEET (RULE 26) 



WO 00/09760 PCT/US99/18750 



gta gca aag gaa gtt ctt tta aga ata ata aaa gga aag aat ttt aaa 480 
Val Ala Lys Glu Val Leu Leu Arg Ile Ile Lys Gly Lys Asn Phe Lys 
145 150 155 160 

gat aaa aat tca tgt gaa aat aaa ctg gaa gta tac tgt caa gaa tta 528 
Asp Lys Asn Ser Cys Glu Asn Lys Leu Glu Val Tyr Cys Gln Glu Leu 
165 170 175 

agt caa atg agt gac gaa ttg atg aaa tta tgt ttt gat caa aaa aat 576 
Ser Gln Met Ser Asp Glu Leu Met Lys Leu Cys Phe Asp Gln Lys Asn 
180 185 190 

acg tgt gat aat ctt gta aaa gaa acg caa caa aag tgt gaa tct ttc 624 
Thr Cys Asp Asn Leu Val Lys Glu Thr Gln Gln Lys Cys Glu Ser Phe 
195 200 205 

aaa aat ctt aaa acg gaa att aaa aca ata aag gaa gat gaa caa cta 672 
Lys Asn Leu Lys Thr Glu Ile Lys Thr Ile Lys Glu Asp Glu Gln Leu 
210 215 220 

aaa aaa aaa tgc cca tta tta tat gaa gaa tgc att ttt tat gat gaa 720 
Lys Lys Lys Cys Pro Leu Leu Tyr Glu Glu Cys Ile Phe Tyr Asp Glu 
225 230 235 240 

agt tgt gga aac gat tca ctg aag tgt agt gaa ttg gaa aaa aaa tgt 768 
Ser Cys Gly Asn Asp Ser Leu Lys Cys Ser Glu Leu Glu Lys Lys Cys 
245 250 255 

caa gag aaa aat att act tac aca tta tca tat tca ggg ttt gat cct 816 
Gln Glu Lys Asn Ile Thr Tyr Thr Leu Ser Tyr Ser Gly Phe Asp Pro 
260 265 270 

ata gaa cca gaa att aca tta gca gaa gaa gta gac tta gaa gga att 864 
Ile Glu Pro Glu Ile Thr Leu Ala Glu Glu Val Asp Leu Glu Gly Ile 
275 280 285 

tat aga aag gca gca gaa gaa gga act ctt gtt ggg aaa cct tta cca 912 
Tyr Arg Lys Ala Ala Glu Glu Gly Thr Leu Val Gly Lys Pro Leu Pro 
290 295 300 

gca gat gct act gct ttg gtg gca ttt ttg att caa gat cca tct ctt 960 
Ala Asp Ala Thr Ala Leu Val Ala Phe Leu Ile Gln Asp Pro Ser Leu 
305 310 315 320 

aca act caa cga act aac aaa gaa aaa tgt aaa aaa att ctt gaa gat 1008 
Thr Thr Gln Arg Thr Asn Lys Glu Lys Cys Lys Lys Ile Leu Glu Asp 
325 330 335 

aaa tgt aaa aat tta aaa gaa cat gat att ata aaa ggt cta tgc gag 1056 
Lys Cys Lys Asn Leu Lys Glu His Asp Ile Ile Lys Gly Leu Cys Glu 
340 345 350 

gat tat aat gca aat aaa gat aag gac aaa aaa tgc gaa gaa ctt agt 1104 
Asp Tyr Asn Ala Asn Lys Asp Lys Asp Lys Lys Cys Glu Glu Leu Ser 
355 360 365 

aca gat att gaa gaa aca tgt aaa ttt ttc att tca aaa acc ctt atg 1152 
Thr Asp Ile Glu Glu Thr Cys Lys Phe Phe Ile Ser Lys Thr Leu Met 
370 375 380 
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att cat ttt ttt ggc gat gga aat aaa aat gat gga att att aaa tgg 1200 

lie His Phe Phe Gly Asp Gly Asn Lys Asn Asp Gly lie lie Lys Trp 
385 390 395 400 

ggg aat tta tea acg ttt eta age aat aaa gat tgt aca aaa tta gaa 1248 

Gly Asn Leu Ser Thr Phe Leu Ser Asn Lys Asp Cys Thr Lys Leu Glu 
405 410 415 



tcg tat tgt ctt tat ttt gaa aaa agc tgt aga agc gaa act gca tgc 1296
Ser Tyr Cys Leu Tyr Phe Glu Lys Ser Cys Arg Ser Glu Thr Ala Cys
420 425 430

aag aat atc aga gca gca tgc tac aag aga gga ctt gac aca tta gca 1344
Lys Asn Ile Arg Ala Ala Cys Tyr Lys Arg Gly Leu Asp Thr Leu Ala
435 440 445

aat gaa gta tta caa aaa gaa atg cga gga atg ctg cat ggt tca aat 1392
Asn Glu Val Leu Gln Lys Glu Met Arg Gly Met Leu His Gly Ser Asn
450 455 460

aaa aca tgg ctt agt ggt ttc caa aaa aaa ctc ata gaa gtg tgc aaa 1440
Lys Thr Trp Leu Ser Gly Phe Gln Lys Lys Leu Ile Glu Val Cys Lys
465 470 475 480

aaa gtg aaa aaa gag aat aaa gga gtt ttt ccg agt aat gaa tta ttt 1488
Lys Val Lys Lys Glu Asn Lys Gly Val Phe Pro Ser Asn Glu Leu Phe
485 490 495

gtc tta tgt gta caa cca tca aaa gca gct cga ttg ctt tcg cat gat 1536
Val Leu Cys Val Gln Pro Ser Lys Ala Ala Arg Leu Leu Ser His Asp
500 505 510

ctt cgg atg aaa act atc ttt ttg caa gac gat ttg aac aga aag cga 1584
Leu Arg Met Lys Thr Ile Phe Leu Gln Asp Asp Leu Asn Arg Lys Arg
515 520 525

gat ttt cca gtg aaa gaa gac tgc gaa gaa tta tta aag aaa tgt gag 1632
Asp Phe Pro Val Lys Glu Asp Cys Glu Glu Leu Leu Lys Lys Cys Glu
530 535 540

gct cta aga aag gat tct aaa aaa att gaa tgg cca tgt cat aca tta 1680
Ala Leu Arg Lys Asp Ser Lys Lys Ile Glu Trp Pro Cys His Thr Leu
545 550 555 560

agc caa aat tgt gat caa ttg aga aac gct aaa gaa ttg aaa gaa ctt 1728
Ser Gln Asn Cys Asp Gln Leu Arg Asn Ala Lys Glu Leu Lys Glu Leu
565 570 575

tta cta aat gaa cat aag gat ata ttg aaa aat caa gag aat tgt gga 1776
Leu Leu Asn Glu His Lys Asp Ile Leu Lys Asn Gln Glu Asn Cys Gly
580 585 590

atg tat ttg aag gag aaa tgc aat gaa tgg tct aga agg aga aat gaa 1824
Met Tyr Leu Lys Glu Lys Cys Asn Glu Trp Ser Arg Arg Arg Asn Glu
595 600 605

cgt ttc tct ctt tta tgt gct ttg caa aat agg act tgc aga ata atg 1872
Arg Phe Ser Leu Leu Cys Ala Leu Gln Asn Arg Thr Cys Arg Ile Met
610 615 620

gta gaa gat gtg aaa aat caa tgc aaa ata ttt gaa aaa aac att aaa 1920
Val Glu Asp Val Lys Asn Gln Cys Lys Ile Phe Glu Lys Asn Ile Lys
625 630 635 640

aaa tac caa ggt att gat agt aaa act aaa ata gaa gaa tta ggg aca 1968 
Lys Tyr Gln Gly Ile Asp Ser Lys Thr Lys Ile Glu Glu Leu Gly Thr
645 650 655

tat tgt cct att tgg cac cca cac tgc cat aga ttt gga ccc aat tgc 2016
Tyr Cys Pro Ile Trp His Pro His Cys His Arg Phe Gly Pro Asn Cys
660 665 670

ccg gat ctt gaa aaa aat aaa tgt gaa gac ttt gaa aaa tat tgc aaa 2064
Pro Asp Leu Glu Lys Asn Lys Cys Glu Asp Phe Glu Lys Tyr Cys Lys
675 680 685

cct tat tat aag caa aga gac ctt gaa aat gca ctt ata ttt gag ttt 2112
Pro Tyr Tyr Lys Gln Arg Asp Leu Glu Asn Ala Leu Ile Phe Glu Phe
690 695 700

aga gga cat ctt gat aag aaa aaa aac tgc aaa aca aat ctt gat aag 2160
Arg Gly His Leu Asp Lys Lys Lys Asn Cys Lys Thr Asn Leu Asp Lys
705 710 715 720

tac tgt aca cta tgg gat caa aca gga aat aaa aca ctt aaa ggt ttt 2208
Tyr Cys Thr Leu Trp Asp Gln Thr Gly Asn Lys Thr Leu Lys Gly Phe
725 730 735

tgt aac agt tct act gat aac aat gaa aca ttt aga gat aaa ctt tgc 2256
Cys Asn Ser Ser Thr Asp Asn Asn Glu Thr Phe Arg Asp Lys Leu Cys
740 745 750

gaa aaa cta gtt cag cgt gtg aaa gaa aaa tgc caa gga tta tca aaa 2304
Glu Lys Leu Val Gln Arg Val Lys Glu Lys Cys Gln Gly Leu Ser Lys
755 760 765

gaa ctt gaa aaa gca aaa aat gat tta gaa gaa aaa cat aaa gat tat 2352
Glu Leu Glu Lys Ala Lys Asn Asp Leu Glu Glu Lys His Lys Asp Tyr
770 775 780

gaa aaa gta aaa aag gat aca aaa aat gca atg gaa gaa aca aat ctc 2400
Glu Lys Val Lys Lys Asp Thr Lys Asn Ala Met Glu Glu Thr Asn Leu
785 790 795 800

gtt ttt tca aca act aaa tca aca gat aat aaa aca gaa aaa gga gtc 2448
Val Phe Ser Thr Thr Lys Ser Thr Asp Asn Lys Thr Glu Lys Gly Val
805 810 815

aag cct agt acg cct agt gta gtt caa gat att gta cat ttt aaa ctt 2496
Lys Pro Ser Thr Pro Ser Val Val Gln Asp Ile Val His Phe Lys Leu
820 825 830

gta aaa aga aat gaa aaa gtt caa gtg aca gaa aaa gaa gca aaa gcg 2544
Val Lys Arg Asn Glu Lys Val Gln Val Thr Glu Lys Glu Ala Lys Ala
835 840 845

ttt gat ttg gta gca cta gca ttc agt ctt tat gta gag tta aaa gaa 2592
Phe Asp Leu Val Ala Leu Ala Phe Ser Leu Tyr Val Glu Leu Lys Glu
850 855 860

acg tgt cac cat cta aag gat gat tgc gaa ttt aga aaa gaa tgt aaa 2640
Thr Cys His His Leu Lys Asp Asp Cys Glu Phe Arg Lys Glu Cys Lys
865 870 875 880

tgt aaa gac cag tgc aaa gag ata gaa aaa ata tgt tta aaa ata gaa 2688 
Cys Lys Asp Gln Cys Lys Glu Ile Glu Lys Ile Cys Leu Lys Ile Glu
885 890 895

cca ctg aaa gta aag cca cat gaa ata aaa aca gta acg gaa acc aac 2736
Pro Leu Lys Val Lys Pro His Glu Ile Lys Thr Val Thr Glu Thr Asn
900 905 910

ata aca aca gtc aca gaa aca gtc aaa gaa gca gaa aaa aca gta gga 2784
Ile Thr Thr Val Thr Glu Thr Val Lys Glu Ala Glu Lys Thr Val Gly
915 920 925

gac gga gag aaa tgc aaa tct ctc agc aca aca gac acg tgg gtc aca 2832
Asp Gly Glu Lys Cys Lys Ser Leu Ser Thr Thr Asp Thr Trp Val Thr
930 935 940

aag acg tca acc cat acc agc acc tcc acg act acg tcc aca gtt acg 2880
Lys Thr Ser Thr His Thr Ser Thr Ser Thr Thr Thr Ser Thr Val Thr
945 950 955 960

tca aga ata aca ctg acc tcg acg agg cgg tgt aag cct acg aag tgt 2928
Ser Arg Ile Thr Leu Thr Ser Thr Arg Arg Cys Lys Pro Thr Lys Cys
965 970 975

acg aca gga gag gaa gat gaa gca gga gag gtg aag ccg agt gag ggg 2976
Thr Thr Gly Glu Glu Asp Glu Ala Gly Glu Val Lys Pro Ser Glu Gly
980 985 990

ctg agg atg agt ggg tgg agt gtg atg aga ggg gtg tta tta gca atg 3024
Leu Arg Met Ser Gly Trp Ser Val Met Arg Gly Val Leu Leu Ala Met
995 1000 1005

atg att tca ttc atg att 3042
Met Ile Ser Phe Met Ile
1010

<210> 2 
<211> 1014 
<212> PRT 

<213> Pneumocystis carinii sp. f. hominis 
<400> 2 

Val Ala Arg Ala Val Lys Arg Gln Val Thr Gly Ala Ser Gly Val Asp
1 5 10 15

Glu Glu Glu Val Arg Leu Leu Ala Leu Ile Leu Lys Glu Asp Ser Lys
20 25 30

Asp Asp Lys Lys Cys Glu Glu Lys Leu Glu Lys His Cys Lys Glu Leu
35 40 45

Ser Glu Ala Asn Leu Thr Pro Glu Gln Val His Glu Lys Leu Lys Asp
50 55 60

Phe Cys Asp Ser Lys Lys Arg Asp Lys Lys Cys Lys Glu Leu Lys Lys
65 70 75 80

Asn Val Glu Lys Lys Cys Gly Asp Phe Lys Thr Glu Leu Glu Glu Leu
85 90 95

Val Lys Lys Glu Ala Ser Asn Leu Lys Asn Asp Glu Cys Thr Lys Asn
100 105 110

Glu Gln Gln Cys Leu Phe Leu Glu Glu Ala Cys Ser Asp Leu Thr Lys
115 120 125

Asn Cys Asn Asp Leu Arg Asn Lys Cys Tyr Gln Asn Lys Arg Asp Lys
130 135 140

Val Ala Lys Glu Val Leu Leu Arg Ile Ile Lys Gly Lys Asn Phe Lys
145 150 155 160

Asp Lys Asn Ser Cys Glu Asn Lys Leu Glu Val Tyr Cys Gln Glu Leu
165 170 175

Ser Gln Met Ser Asp Glu Leu Met Lys Leu Cys Phe Asp Gln Lys Asn
180 185 190

Thr Cys Asp Asn Leu Val Lys Glu Thr Gln Gln Lys Cys Glu Ser Phe
195 200 205

Lys Asn Leu Lys Thr Glu Ile Lys Thr Ile Lys Glu Asp Glu Gln Leu
210 215 220

Lys Lys Lys Cys Pro Leu Leu Tyr Glu Glu Cys Ile Phe Tyr Asp Glu
225 230 235 240

Ser Cys Gly Asn Asp Ser Leu Lys Cys Ser Glu Leu Glu Lys Lys Cys
245 250 255

Gln Glu Lys Asn Ile Thr Tyr Thr Leu Ser Tyr Ser Gly Phe Asp Pro
260 265 270

Ile Glu Pro Glu Ile Thr Leu Ala Glu Glu Val Asp Leu Glu Gly Ile
275 280 285

Tyr Arg Lys Ala Ala Glu Glu Gly Thr Leu Val Gly Lys Pro Leu Pro
290 295 300

Ala Asp Ala Thr Ala Leu Val Ala Phe Leu Ile Gln Asp Pro Ser Leu
305 310 315 320

Thr Thr Gln Arg Thr Asn Lys Glu Lys Cys Lys Lys Ile Leu Glu Asp
325 330 335

Lys Cys Lys Asn Leu Lys Glu His Asp Ile Ile Lys Gly Leu Cys Glu
340 345 350

Asp Tyr Asn Ala Asn Lys Asp Lys Asp Lys Lys Cys Glu Glu Leu Ser
355 360 365

Thr Asp Ile Glu Glu Thr Cys Lys Phe Phe Ile Ser Lys Thr Leu Met
370 375 380

Ile His Phe Phe Gly Asp Gly Asn Lys Asn Asp Gly Ile Ile Lys Trp
385 390 395 400

Gly Asn Leu Ser Thr Phe Leu Ser Asn Lys Asp Cys Thr Lys Leu Glu
405 410 415

Ser Tyr Cys Leu Tyr Phe Glu Lys Ser Cys Arg Ser Glu Thr Ala Cys
420 425 430

Lys Asn Ile Arg Ala Ala Cys Tyr Lys Arg Gly Leu Asp Thr Leu Ala
435 440 445

Asn Glu Val Leu Gln Lys Glu Met Arg Gly Met Leu His Gly Ser Asn
450 455 460

Lys Thr Trp Leu Ser Gly Phe Gln Lys Lys Leu Ile Glu Val Cys Lys
465 470 475 480

Lys Val Lys Lys Glu Asn Lys Gly Val Phe Pro Ser Asn Glu Leu Phe
485 490 495

Val Leu Cys Val Gln Pro Ser Lys Ala Ala Arg Leu Leu Ser His Asp
500 505 510

Leu Arg Met Lys Thr Ile Phe Leu Gln Asp Asp Leu Asn Arg Lys Arg
515 520 525

Asp Phe Pro Val Lys Glu Asp Cys Glu Glu Leu Leu Lys Lys Cys Glu
530 535 540

Ala Leu Arg Lys Asp Ser Lys Lys Ile Glu Trp Pro Cys His Thr Leu
545 550 555 560

Ser Gln Asn Cys Asp Gln Leu Arg Asn Ala Lys Glu Leu Lys Glu Leu
565 570 575

Leu Leu Asn Glu His Lys Asp Ile Leu Lys Asn Gln Glu Asn Cys Gly
580 585 590

Met Tyr Leu Lys Glu Lys Cys Asn Glu Trp Ser Arg Arg Arg Asn Glu
595 600 605

Arg Phe Ser Leu Leu Cys Ala Leu Gln Asn Arg Thr Cys Arg Ile Met
610 615 620

Val Glu Asp Val Lys Asn Gln Cys Lys Ile Phe Glu Lys Asn Ile Lys
625 630 635 640

Lys Tyr Gln Gly Ile Asp Ser Lys Thr Lys Ile Glu Glu Leu Gly Thr
645 650 655

Tyr Cys Pro Ile Trp His Pro His Cys His Arg Phe Gly Pro Asn Cys
660 665 670

Pro Asp Leu Glu Lys Asn Lys Cys Glu Asp Phe Glu Lys Tyr Cys Lys
675 680 685

Pro Tyr Tyr Lys Gln Arg Asp Leu Glu Asn Ala Leu Ile Phe Glu Phe
690 695 700

Arg Gly His Leu Asp Lys Lys Lys Asn Cys Lys Thr Asn Leu Asp Lys
705 710 715 720

Tyr Cys Thr Leu Trp Asp Gln Thr Gly Asn Lys Thr Leu Lys Gly Phe
725 730 735

Cys Asn Ser Ser Thr Asp Asn Asn Glu Thr Phe Arg Asp Lys Leu Cys
740 745 750

Glu Lys Leu Val Gln Arg Val Lys Glu Lys Cys Gln Gly Leu Ser Lys
755 760 765

Glu Leu Glu Lys Ala Lys Asn Asp Leu Glu Glu Lys His Lys Asp Tyr
770 775 780

Glu Lys Val Lys Lys Asp Thr Lys Asn Ala Met Glu Glu Thr Asn Leu
785 790 795 800

Val Phe Ser Thr Thr Lys Ser Thr Asp Asn Lys Thr Glu Lys Gly Val
805 810 815

Lys Pro Ser Thr Pro Ser Val Val Gln Asp Ile Val His Phe Lys Leu
820 825 830

Val Lys Arg Asn Glu Lys Val Gln Val Thr Glu Lys Glu Ala Lys Ala
835 840 845

Phe Asp Leu Val Ala Leu Ala Phe Ser Leu Tyr Val Glu Leu Lys Glu
850 855 860

Thr Cys His His Leu Lys Asp Asp Cys Glu Phe Arg Lys Glu Cys Lys
865 870 875 880

Cys Lys Asp Gln Cys Lys Glu Ile Glu Lys Ile Cys Leu Lys Ile Glu
885 890 895

Pro Leu Lys Val Lys Pro His Glu Ile Lys Thr Val Thr Glu Thr Asn
900 905 910

Ile Thr Thr Val Thr Glu Thr Val Lys Glu Ala Glu Lys Thr Val Gly
915 920 925

Asp Gly Glu Lys Cys Lys Ser Leu Ser Thr Thr Asp Thr Trp Val Thr
930 935 940

Lys Thr Ser Thr His Thr Ser Thr Ser Thr Thr Thr Ser Thr Val Thr
945 950 955 960

Ser Arg Ile Thr Leu Thr Ser Thr Arg Arg Cys Lys Pro Thr Lys Cys
965 970 975

Thr Thr Gly Glu Glu Asp Glu Ala Gly Glu Val Lys Pro Ser Glu Gly
980 985 990

Leu Arg Met Ser Gly Trp Ser Val Met Arg Gly Val Leu Leu Ala Met
995 1000 1005

Met Ile Ser Phe Met Ile
1010



<210> 3 
<211> 3006 
<212> DNA 

<213> Pneumocystis carinii sp. f. hominis 
<220>
<221> CDS
<222> (1)..(3006)

<400> 3 

gtg gcg cgg gcg gtc aag cgg cgg gct gca gca cag aat agt gtt gaa 48
Val Ala Arg Ala Val Lys Arg Arg Ala Ala Ala Gln Asn Ser Val Glu
1 5 10 15

gaa gaa tat ctt ttg gct ttg att tta gaa aat gag tat gaa aat aat 96
Glu Glu Tyr Leu Leu Ala Leu Ile Leu Glu Asn Glu Tyr Glu Asn Asn
20 25 30

gat aaa tgt aaa aaa agg ttg aaa gag tat tgt gaa gtt tta aaa aat 144
Asp Lys Cys Lys Lys Arg Leu Lys Glu Tyr Cys Glu Val Leu Lys Asn
35 40 45

gta aca aaa gaa cca aaa aaa cta gaa gaa aag tta gac gga atc tgc 192
Val Thr Lys Glu Pro Lys Lys Leu Glu Glu Lys Leu Asp Gly Ile Cys
50 55 60

aaa gat gat aaa aca ata gaa gca aaa tgc aaa gaa tca gaa aca aag 240
Lys Asp Asp Lys Thr Ile Glu Ala Lys Cys Lys Glu Ser Glu Thr Lys
65 70 75 80

gtt aaa gca aag tgt act agt ttt caa aca gaa ctt gat aaa gca gtc 288
Val Lys Ala Lys Cys Thr Ser Phe Gln Thr Glu Leu Asp Lys Ala Val
85 90 95

aaa aag gga gct tca aca tta gaa gat aat gat tgt aag aag aat gaa 336
Lys Lys Gly Ala Ser Thr Leu Glu Asp Asn Asp Cys Lys Lys Asn Glu
100 105 110

cga caa tgc ctg ttt ttg gag gga gca tgt cca aca gaa ctt aaa gat 384
Arg Gln Cys Leu Phe Leu Glu Gly Ala Cys Pro Thr Glu Leu Lys Asp
115 120 125

aaa tgt aat gaa ctg agg aat aaa tgt tat caa aaa aaa cga gac gac 432
Lys Cys Asn Glu Leu Arg Asn Lys Cys Tyr Gln Lys Lys Arg Asp Asp
130 135 140

gta gca gaa aaa gct ctt tta aga gta ctt aga ggg aac ctt aag gat 480
Val Ala Glu Lys Ala Leu Leu Arg Val Leu Arg Gly Asn Leu Lys Asp
145 150 155 160

aaa aac aca tgc aaa aat aag tta aag ggg gtt tgt caa gaa ttc aac 528
Lys Asn Thr Cys Lys Asn Lys Leu Lys Gly Val Cys Gln Glu Phe Asn
165 170 175

aaa gaa agt gat gag cta ata aaa tta tgt ctt gac gaa gaa aaa acg 576
Lys Glu Ser Asp Glu Leu Ile Lys Leu Cys Leu Asp Glu Glu Lys Thr
180 185 190

tgt gga gat ctt gta tct aag aaa gaa tac aaa tgc aaa cct ctc aaa 624
Cys Gly Asp Leu Val Ser Lys Lys Glu Tyr Lys Cys Lys Pro Leu Lys
195 200 205

gaa gga att gat cta gtg ctt gga aag gaa gat tta tta aaa gaa aaa 672
Glu Gly Ile Asp Leu Val Leu Gly Lys Glu Asp Leu Leu Lys Glu Lys
210 215 220

tgt tta tta ttt ctt gaa gaa tgt tac ttt tat ggg tca aac tgt gaa 720
Cys Leu Leu Phe Leu Glu Glu Cys Tyr Phe Tyr Gly Ser Asn Cys Glu
225 230 235 240

aca gat cag cca aag tgt aaa gag ttt gca agc aaa tgt caa aag gaa 768
Thr Asp Gln Pro Lys Cys Lys Glu Phe Ala Ser Lys Cys Gln Lys Glu
245 250 255

aat ctc gtt tat gca gca cca ggt tca cac ttt gat cct acg aaa tta 816
Asn Leu Val Tyr Ala Ala Pro Gly Ser His Phe Asp Pro Thr Lys Leu
260 265 270

aag att agg tta gca gaa gaa ata gac cta gaa aaa ttg tac gta gaa 864
Lys Ile Arg Leu Ala Glu Glu Ile Asp Leu Glu Lys Leu Tyr Val Glu
275 280 285

gca gtg aaa aag gga att cat att gga agg cca tca ata aaa gat gaa 912
Ala Val Lys Lys Gly Ile His Ile Gly Arg Pro Ser Ile Lys Asp Glu
290 295 300

gtc gct tta ttg gca tta tta agc aag agt gat gct caa aat act ttt 960
Val Ala Leu Leu Ala Leu Leu Ser Lys Ser Asp Ala Gln Asn Thr Phe
305 310 315 320

aaa gat caa tgt gaa gat gtt att aaa aaa aaa tgt gga aac ttt aaa 1008
Lys Asp Gln Cys Glu Asp Val Ile Lys Lys Lys Cys Gly Asn Phe Lys
325 330 335

gag cat att att tta aaa gat tta tgt agt aat aag act atc act gat 1056
Glu His Ile Ile Leu Lys Asp Leu Cys Ser Asn Lys Thr Ile Thr Asp
340 345 350

aat cca aaa gaa aaa tgc gaa gaa cta aat aag gag tta aca acc cgt 1104
Asn Pro Lys Glu Lys Cys Glu Glu Leu Asn Lys Glu Leu Thr Thr Arg
355 360 365

att tta act gtt tct aaa agg att gag aaa tat ttc gct cca gct aat 1152
Ile Leu Thr Val Ser Lys Arg Ile Glu Lys Tyr Phe Ala Pro Ala Asn
370 375 380

gta aag gaa att att ggt tgg cat atg ttg cat aca ttt ctt ggt gaa 1200
Val Lys Glu Ile Ile Gly Trp His Met Leu His Thr Phe Leu Gly Glu
385 390 395 400

aga gag tgt acg aaa ctg ttg tcg gat tgt ttt tat ttg aaa agc caa 1248
Arg Glu Cys Thr Lys Leu Leu Ser Asp Cys Phe Tyr Leu Lys Ser Gln
405 410 415

gct cca ctt gaa aag ccc tgc aat aac tta aaa gca gca tgt tat aaa 1296
Ala Pro Leu Glu Lys Pro Cys Asn Asn Leu Lys Ala Ala Cys Tyr Lys
420 425 430

aaa ggg ctt gaa gca gta gca aat gaa gca tta caa gat aag tta cgg 1344
Lys Gly Leu Glu Ala Val Ala Asn Glu Ala Leu Gln Asp Lys Leu Arg
435 440 445

gga aaa ttg caa ggt tca aat aga aca tgg ctt gaa acc ctt caa aaa 1392
Gly Lys Leu Gln Gly Ser Asn Arg Thr Trp Leu Glu Thr Leu Gln Lys
450 455 460

aac ttg gta aaa gtt tgt gaa aag acg aaa gga gaa agt gat gaa tta 1440
Asn Leu Val Lys Val Cys Glu Lys Thr Lys Gly Glu Ser Asp Glu Leu
465 470 475 480

ttt gta cta tgt atg aac cca ata aaa acg gct ctt aca gtg tca aca 1488
Phe Val Leu Cys Met Asn Pro Ile Lys Thr Ala Leu Thr Val Ser Thr
485 490 495

gat ttg cga atg agg gca gtt gct ttg caa gag cat ttg aac gaa aaa 1536
Asp Leu Arg Met Arg Ala Val Ala Leu Gln Glu His Leu Asn Glu Lys
500 505 510

cga gat ttt cca aca gaa aag gat tgt aaa gaa tta gag aaa aaa tgt 1584
Arg Asp Phe Pro Thr Glu Lys Asp Cys Lys Glu Leu Glu Lys Lys Cys
515 520 525

gag gtc tta gga aaa gat tca aga gaa att aaa tgg tca tgt tat acg 1632
Glu Val Leu Gly Lys Asp Ser Arg Glu Ile Lys Trp Ser Cys Tyr Thr
530 535 540

tta aaa cag cat tgc aat cgg ctg aag agc ata gag cac tta gaa gag 1680
Leu Lys Gln His Cys Asn Arg Leu Lys Ser Ile Glu His Leu Glu Glu
545 550 555 560

gag ttg cta aaa gaa aat aaa gga tat tta aaa gat gaa aat agc tgc 1728
Glu Leu Leu Lys Glu Asn Lys Gly Tyr Leu Lys Asp Glu Asn Ser Cys
565 570 575

aaa gaa gaa gct aag aaa cga tgt gaa aaa tgg ttt aga aga gaa aat 1776
Lys Glu Glu Ala Lys Lys Arg Cys Glu Lys Trp Phe Arg Arg Glu Asn
580 585 590

aat aaa ttt ttt tcg gct tgt tct gac ttg gaa ctt gtt tgc aaa aag 1824
Asn Lys Phe Phe Ser Ala Cys Ser Asp Leu Glu Leu Val Cys Lys Lys
595 600 605

atc act aga aat gtt gaa tct aaa tgt aat ata ttg aaa gga cat atg 1872
Ile Thr Arg Asn Val Glu Ser Lys Cys Asn Ile Leu Lys Gly His Met
610 615 620

gaa act atg aac gtt ata agt gaa ata gct aaa aaa gag gaa aaa ata 1920
Glu Thr Met Asn Val Ile Ser Glu Ile Ala Lys Lys Glu Glu Lys Ile
625 630 635 640

tgt gaa ttt tgg gct cca tat tgt aaa aag tac gag caa aat tgt gaa 1968
Cys Glu Phe Trp Ala Pro Tyr Cys Lys Lys Tyr Glu Gln Asn Cys Glu
645 650 655

aaa ctt aaa aac gga gga aaa gat ggg caa tgc aaa aaa ctc aat aaa 2016
Lys Leu Lys Asn Gly Gly Lys Asp Gly Gln Cys Lys Lys Leu Asn Lys
660 665 670

aag tgc aaa tca ttc ctt gaa aaa gaa gct tta gaa aat aaa gtt gta 2064
Lys Cys Lys Ser Phe Leu Glu Lys Glu Ala Leu Glu Asn Lys Val Val
675 680 685

gaa gaa ttg aaa ggt agt tta tca aac gta gga gaa tgt aac aat aca 2112
Glu Glu Leu Lys Gly Ser Leu Ser Asn Val Gly Glu Cys Asn Asn Thr
690 695 700

ctt aat ata tac tgt aca caa ttg aaa aag gca gag aat ggg ttg gaa 2160
Leu Asn Ile Tyr Cys Thr Gln Leu Lys Lys Ala Glu Asn Gly Leu Glu
705 710 715 720

act ttg tgc aaa agc aaa gaa aac acc aag agt gac att aaa gtt aga 2208
Thr Leu Cys Lys Ser Lys Glu Asn Thr Lys Ser Asp Ile Lys Val Arg
725 730 735

gaa gaa ctc tgt gaa aag cta ata aaa cgt ata aaa gaa aaa tgc tca 2256
Glu Glu Leu Cys Glu Lys Leu Ile Lys Arg Ile Lys Glu Lys Cys Ser
740 745 750

aaa ttg aag gac gag ctt gaa gaa gta aaa gag gtc tta gaa aag aaa 2304
Lys Leu Lys Asp Glu Leu Glu Glu Val Lys Glu Val Leu Glu Lys Lys
755 760 765

gaa gaa aag tat aaa aaa att aaa gaa gaa gca gaa aaa gcc atg gaa 2352
Glu Glu Lys Tyr Lys Lys Ile Lys Glu Glu Ala Glu Lys Ala Met Glu
770 775 780

gat gca aac ctt att tta tcg aga gcg aaa gga cct gat aat aat aat 2400
Asp Ala Asn Leu Ile Leu Ser Arg Ala Lys Gly Pro Asp Asn Asn Asn
785 790 795 800

aat aag tca gta aat aaa gac tca tct gat aca cct aag gaa gga aaa 2448
Asn Lys Ser Val Asn Lys Asp Ser Ser Asp Thr Pro Lys Glu Gly Lys
805 810 815

ggc aca aca gga ttt aaa ctt gta aga aga aat gca aaa gtg cat gta 2496
Gly Thr Thr Gly Phe Lys Leu Val Arg Arg Asn Ala Lys Val His Val
820 825 830

aca gaa aaa gaa tta gca gca ttt gat ttg gta gca aga gca ttt gat 2544
Thr Glu Lys Glu Leu Ala Ala Phe Asp Leu Val Ala Arg Ala Phe Asp
835 840 845

ctc tat cta gaa ttg aaa gaa ata tgt aat cat tca ctg aag aat tgt 2592
Leu Tyr Leu Glu Leu Lys Glu Ile Cys Asn His Ser Leu Lys Asn Cys
850 855 860

ggt ttc aaa aaa gag tgt gac tgt gag gat cca tgt aaa aag ata cag 2640
Gly Phe Lys Lys Glu Cys Asp Cys Glu Asp Pro Cys Lys Lys Ile Gln
865 870 875 880

gga ata tgt tca aca tta gag cca cta aaa gtg aga cca cac gaa ata 2688
Gly Ile Cys Ser Thr Leu Glu Pro Leu Lys Val Arg Pro His Glu Ile
885 890 895

gta act aaa aac ata aca act aca acc aca acc acc acc aca act acc 2736
Val Thr Lys Asn Ile Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr
900 905 910

att aaa gac gca aag gca aca gac tgc cac tct tta cag aca aca gat
Ile Lys Asp Ala Lys Ala Thr Asp Cys His Ser Leu Gln Thr Thr Asp
915 920 925

acg tgg gtc aca aag acg tcg acc cat act agc aca tcc aca acc aca
Thr Trp Val Thr Lys Thr Ser Thr His Thr Ser Thr Ser Thr Thr Thr
930 935 940

tct aca gtc acg tca aga ata acg ttg acc tcg aca aga cgg tgt aag
Ser Thr Val Thr Ser Arg Ile Thr Leu Thr Ser Thr Arg Arg Cys Lys
945 950 955 960

cct acg aag tgt acg aca gga gag gaa gat gaa gca gga gac gtg aaa 2928
Pro Thr Lys Cys Thr Thr Gly Glu Glu Asp Glu Ala Gly Asp Val Lys
965 970 975

ccg agt gaa ggg ttg agg atg agt gga tgg agt gtg atg agg ggg gtg 2976
Pro Ser Glu Gly Leu Arg Met Ser Gly Trp Ser Val Met Arg Gly Val
980 985 990

tta tta gca atg acg att tca ttc atg att 3006
Leu Leu Ala Met Thr Ile Ser Phe Met Ile
995 1000



<210> 4 
<211> 1002 
<212> PRT 

<213> Pneumocystis carinii sp. f. hominis 
<400> 4 

Val Ala Arg Ala Val Lys Arg Arg Ala Ala Ala Gln Asn Ser Val Glu
1 5 10 15

Glu Glu Tyr Leu Leu Ala Leu Ile Leu Glu Asn Glu Tyr Glu Asn Asn
20 25 30

Asp Lys Cys Lys Lys Arg Leu Lys Glu Tyr Cys Glu Val Leu Lys Asn
35 40 45

Val Thr Lys Glu Pro Lys Lys Leu Glu Glu Lys Leu Asp Gly Ile Cys
50 55 60

Lys Asp Asp Lys Thr Ile Glu Ala Lys Cys Lys Glu Ser Glu Thr Lys
65 70 75 80

Val Lys Ala Lys Cys Thr Ser Phe Gln Thr Glu Leu Asp Lys Ala Val
85 90 95

Lys Lys Gly Ala Ser Thr Leu Glu Asp Asn Asp Cys Lys Lys Asn Glu
100 105 110

Arg Gln Cys Leu Phe Leu Glu Gly Ala Cys Pro Thr Glu Leu Lys Asp
115 120 125

Lys Cys Asn Glu Leu Arg Asn Lys Cys Tyr Gln Lys Lys Arg Asp Asp
130 135 140

Val Ala Glu Lys Ala Leu Leu Arg Val Leu Arg Gly Asn Leu Lys Asp
145 150 155 160

Lys Asn Thr Cys Lys Asn Lys Leu Lys Gly Val Cys Gln Glu Phe Asn
165 170 175

Lys Glu Ser Asp Glu Leu Ile Lys Leu Cys Leu Asp Glu Glu Lys Thr
180 185 190

Cys Gly Asp Leu Val Ser Lys Lys Glu Tyr Lys Cys Lys Pro Leu Lys
195 200 205

Glu Gly Ile Asp Leu Val Leu Gly Lys Glu Asp Leu Leu Lys Glu Lys
210 215 220

Cys Leu Leu Phe Leu Glu Glu Cys Tyr Phe Tyr Gly Ser Asn Cys Glu
225 230 235 240

Thr Asp Gln Pro Lys Cys Lys Glu Phe Ala Ser Lys Cys Gln Lys Glu
245 250 255

Asn Leu Val Tyr Ala Ala Pro Gly Ser His Phe Asp Pro Thr Lys Leu
260 265 270

Lys Ile Arg Leu Ala Glu Glu Ile Asp Leu Glu Lys Leu Tyr Val Glu
275 280 285

Ala Val Lys Lys Gly Ile His Ile Gly Arg Pro Ser Ile Lys Asp Glu
290 295 300

Val Ala Leu Leu Ala Leu Leu Ser Lys Ser Asp Ala Gln Asn Thr Phe
305 310 315 320

Lys Asp Gln Cys Glu Asp Val Ile Lys Lys Lys Cys Gly Asn Phe Lys
325 330 335

Glu His Ile Ile Leu Lys Asp Leu Cys Ser Asn Lys Thr Ile Thr Asp
340 345 350

Asn Pro Lys Glu Lys Cys Glu Glu Leu Asn Lys Glu Leu Thr Thr Arg
355 360 365

Ile Leu Thr Val Ser Lys Arg Ile Glu Lys Tyr Phe Ala Pro Ala Asn
370 375 380

Val Lys Glu Ile Ile Gly Trp His Met Leu His Thr Phe Leu Gly Glu
385 390 395 400

Arg Glu Cys Thr Lys Leu Leu Ser Asp Cys Phe Tyr Leu Lys Ser Gln
405 410 415

Ala Pro Leu Glu Lys Pro Cys Asn Asn Leu Lys Ala Ala Cys Tyr Lys
420 425 430

Lys Gly Leu Glu Ala Val Ala Asn Glu Ala Leu Gln Asp Lys Leu Arg
435 440 445

Gly Lys Leu Gln Gly Ser Asn Arg Thr Trp Leu Glu Thr Leu Gln Lys
450 455 460

Asn Leu Val Lys Val Cys Glu Lys Thr Lys Gly Glu Ser Asp Glu Leu
465 470 475 480

Phe Val Leu Cys Met Asn Pro Ile Lys Thr Ala Leu Thr Val Ser Thr
485 490 495

Asp Leu Arg Met Arg Ala Val Ala Leu Gln Glu His Leu Asn Glu Lys
500 505 510

Arg Asp Phe Pro Thr Glu Lys Asp Cys Lys Glu Leu Glu Lys Lys Cys
515 520 525

Glu Val Leu Gly Lys Asp Ser Arg Glu Ile Lys Trp Ser Cys Tyr Thr
530 535 540

Leu Lys Gln His Cys Asn Arg Leu Lys Ser Ile Glu His Leu Glu Glu
545 550 555 560

Glu Leu Leu Lys Glu Asn Lys Gly Tyr Leu Lys Asp Glu Asn Ser Cys
565 570 575

Lys Glu Glu Ala Lys Lys Arg Cys Glu Lys Trp Phe Arg Arg Glu Asn
580 585 590

Asn Lys Phe Phe Ser Ala Cys Ser Asp Leu Glu Leu Val Cys Lys Lys
595 600 605

Ile Thr Arg Asn Val Glu Ser Lys Cys Asn Ile Leu Lys Gly His Met
610 615 620

Glu Thr Met Asn Val Ile Ser Glu Ile Ala Lys Lys Glu Glu Lys Ile
625 630 635 640

Cys Glu Phe Trp Ala Pro Tyr Cys Lys Lys Tyr Glu Gln Asn Cys Glu
645 650 655

Lys Leu Lys Asn Gly Gly Lys Asp Gly Gln Cys Lys Lys Leu Asn Lys
660 665 670

Lys Cys Lys Ser Phe Leu Glu Lys Glu Ala Leu Glu Asn Lys Val Val
675 680 685

Glu Glu Leu Lys Gly Ser Leu Ser Asn Val Gly Glu Cys Asn Asn Thr
690 695 700

Leu Asn Ile Tyr Cys Thr Gln Leu Lys Lys Ala Glu Asn Gly Leu Glu
705 710 715 720

Thr Leu Cys Lys Ser Lys Glu Asn Thr Lys Ser Asp Ile Lys Val Arg
725 730 735

Glu Glu Leu Cys Glu Lys Leu Ile Lys Arg Ile Lys Glu Lys Cys Ser
740 745 750

Lys Leu Lys Asp Glu Leu Glu Glu Val Lys Glu Val Leu Glu Lys Lys
755 760 765

Glu Glu Lys Tyr Lys Lys Ile Lys Glu Glu Ala Glu Lys Ala Met Glu
770 775 780

Asp Ala Asn Leu Ile Leu Ser Arg Ala Lys Gly Pro Asp Asn Asn Asn
785 790 795 800

Asn Lys Ser Val Asn Lys Asp Ser Ser Asp Thr Pro Lys Glu Gly Lys
805 810 815

Gly Thr Thr Gly Phe Lys Leu Val Arg Arg Asn Ala Lys Val His Val
820 825 830

Thr Glu Lys Glu Leu Ala Ala Phe Asp Leu Val Ala Arg Ala Phe Asp
835 840 845

Leu Tyr Leu Glu Leu Lys Glu Ile Cys Asn His Ser Leu Lys Asn Cys
850 855 860

Gly Phe Lys Lys Glu Cys Asp Cys Glu Asp Pro Cys Lys Lys Ile Gln
865 870 875 880

Gly Ile Cys Ser Thr Leu Glu Pro Leu Lys Val Arg Pro His Glu Ile
885 890 895

Val Thr Lys Asn Ile Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr
900 905 910

Ile Lys Asp Ala Lys Ala Thr Asp Cys His Ser Leu Gln Thr Thr Asp
915 920 925

Thr Trp Val Thr Lys Thr Ser Thr His Thr Ser Thr Ser Thr Thr Thr
930 935 940

Ser Thr Val Thr Ser Arg Ile Thr Leu Thr Ser Thr Arg Arg Cys Lys
945 950 955 960

Pro Thr Lys Cys Thr Thr Gly Glu Glu Asp Glu Ala Gly Asp Val Lys
965 970 975

Pro Ser Glu Gly Leu Arg Met Ser Gly Trp Ser Val Met Arg Gly Val
980 985 990

Leu Leu Ala Met Thr Ile Ser Phe Met Ile
995 1000



<210> 5 
<211> 3090 

<212> DNA

<213> Pneumocystis carinii sp. f. hominis
<220>
<221> CDS
<222> (1)..(3090)

atg gcg cgg gcg gtc aag cgg cgg gca aaa ggt gca cag aat agc att 48
Met Ala Arg Ala Val Lys Arg Arg Ala Lys Gly Ala Gln Asn Ser Ile
1 5 10 15

gat gag gag cat gtt tta gct ttg att tta aaa aaa aat gga tta gaa 96
Asp Glu Glu His Val Leu Ala Leu Ile Leu Lys Lys Asn Gly Leu Glu
20 25 30

gat aca aaa tgc aaa act aag ttg gaa gaa tat tgc aaa aca tta aca 144
Asp Thr Lys Cys Lys Thr Lys Leu Glu Glu Tyr Cys Lys Thr Leu Thr
35 40 45

aat gca gga tta aat cca gaa aaa gtt cac gaa aaa tta aaa gat ttc 192
Asn Ala Gly Leu Asn Pro Glu Lys Val His Glu Lys Leu Lys Asp Phe
50 55 60

tgt gat aac ggg aaa cga aat gaa aaa tgt caa gat cta aaa aac aaa 240
Cys Asp Asn Gly Lys Arg Asn Glu Lys Cys Gln Asp Leu Lys Asn Lys
65 70 75 80

ate aat caa aaa tgc att aaa ttt caa gga aaa ctt caa aca gct gct 288
Val Asn Gln Lys Cys Ile Lys Phe Gln Gly Lys Leu Gln Thr Ala Ala
85 90 95



aqa aaa aaa att tca gaa tta aca gat gag gat tgc aaa aag aat gaa 336
Gly Lys Lys Ile Ser Glu Leu Thr Asp Glu Asp Cys Lys Lys Asn Glu
100 105 110

caa caa tgc cta ttt ttg gag gga gca tgt cca aca gaa ctt aaa gat 384
Gln Gln Cys Leu Phe Leu Glu Gly Ala Cys Pro Thr Glu Leu Lys Asp
115 120 125

gac tgc aat aaa tta agg aat aac tgt tat caa aaa gaa cgg aac aat 432
Asp Cys Asn Lys Leu Arg Asn Asn Cys Tyr Gln Lys Glu Arg Asn Asn
130 135 140

gtg gca gaa gaa gtt ctt ttg agg gcg ctt cgt ggt gat ctc aat gaa 480
Val Ala Glu Glu Val Leu Leu Arg Ala Leu Arg Gly Asp Leu Asn Glu
145 150 155 160

aca aag aca tgt gaa aaa aag ctg aaa gaa gtt tgc ccg aaa tta gaa 528
Thr Lys Thr Cys Glu Lys Lys Leu Lys Glu Val Cys Pro Lys Leu Glu
165 170 175

aga gaa agc gat gaa tta acg gag ctt tgt ctt tat caa aaa aca aca 576
Arg Glu Ser Asp Glu Leu Thr Glu Leu Cys Leu Tyr Gln Lys Thr Thr
180 185 190

tgc gta agt ctt gta aca aaa gga aaa agt aaa tgt gat act ctt gaa 624
Cys Val Ser Leu Val Thr Lys Gly Lys Ser Lys Cys Asp Thr Leu Glu
195 200 205

aaa gaa gtt gaa gaa gca ctt aag aag aat gaa ttg cga gaa aaa tgt 672
Lys Glu Val Glu Glu Ala Leu Lys Lys Asn Glu Leu Arg Glu Lys Cys
210 215 220

cta cta tta ctt gag caa tgt tac ttt cac aga ggg aac tgt gaa gga 720
Leu Leu Leu Leu Glu Gln Cys Tyr Phe His Arg Gly Asn Cys Glu Gly
225 230 235 240

gac aaa tca aag tgc aat aaa cct aat aat aaa gac tgc aaa gaa tat 768
Asp Lys Ser Lys Cys Asn Lys Pro Asn Asn Lys Asp Cys Lys Glu Tyr
245 250 255

gta cca gag tgt gat gaa tta gca gaa aag tgt gga aaa gaa aat att 816
Val Pro Glu Cys Asp Glu Leu Ala Glu Lys Cys Gly Lys Glu Asn Ile
260 265 270

gtt tat atg cat cca gga tcc gat ttc gat cca act aag cca gag cct 864
Val Tyr Met His Pro Gly Ser Asp Phe Asp Pro Thr Lys Pro Glu Pro
275 280 285

aca cta gca gag gac ata ggg ctg gaa gag ctt tat aag agg gca gaa 912
Thr Leu Ala Glu Asp Ile Gly Leu Glu Glu Leu Tyr Lys Arg Ala Glu
290 295 300

gag gat gga att ttt gtt gga aga caa cat gta aga gat gca aca gct 960
Glu Asp Gly Ile Phe Val Gly Arg Gln His Val Arg Asp Ala Thr Ala
305 310 315 320

ttg ttg gca cta ctt ctt aag aaa acc ctt aaa aaa gaa gaa tgt ata 1008
Leu Leu Ala Leu Leu Leu Lys Lys Thr Leu Lys Lys Glu Glu Cys Ile
325 330 335

aaa gcc ctt aaa aaa aac tgc gaa aac cct cat gaa cat gag gcc tta 1056
Lys Ala Leu Lys Lys Asn Cys Glu Asn Pro His Glu His Glu Ala Leu
340 345 350

gaa aat cta tgt aag gaa aat aaa cca agt agt gat gga acg aaa aaa 1104
Glu Asn Leu Cys Lys Glu Asn Lys Pro Ser Ser Asp Gly Thr Lys Lys
355 360 365

tgt gat gaa cta gaa aaa gat gtt aac aaa act tgt aca agt ctt aca 1152
Cys Asp Glu Leu Glu Lys Asp Val Asn Lys Thr Cys Thr Ser Leu Thr
370 375 380

tca aca att ctt aaa aac cgt ctt tac att tca cct gat gga att gcg 1200
Ser Thr Ile Leu Lys Asn Arg Leu Tyr Ile Ser Pro Asp Gly Ile Ala
385 390 395 400

gaa tgg gga aaa tta ccg aca ttt ctt agt gat gaa gat tgt gca aaa 1248
Glu Trp Gly Lys Leu Pro Thr Phe Leu Ser Asp Glu Asp Cys Ala Lys
405 410 415

cta gaa tct tat tgc ttt tat tat aaa gaa act tgt cca gat gtc aaa 1296
Leu Glu Ser Tyr Cys Phe Tyr Tyr Lys Glu Thr Cys Pro Asp Val Lys
420 425 430

gaa gct tgt atg aat gtg agg gca gcg tgt tat aag aga ggg ctt gat 1344
Glu Ala Cys Met Asn Val Arg Ala Ala Cys Tyr Lys Arg Gly Leu Asp
435 440 445

gca cgg gca aac agt gtg ttg caa aaa aat atg cga ggg tta ttg cat 1392
Ala Arg Ala Asn Ser Val Leu Gln Lys Asn Met Arg Gly Leu Leu His
450 455 460

ggc tca aat aaa gat tgg ctt aag aaa ttt caa caa gaa tta gca aaa 1440
Gly Ser Asn Lys Asp Trp Leu Lys Lys Phe Gln Gln Glu Leu Ala Lys
465 470 475 480

gta tgt gag aaa ctg aaa gga aat aaa gga agt ttc tcg aac gat gaa 1488
Val Cys Glu Lys Leu Lys Gly Asn Lys Gly Ser Phe Ser Asn Asp Glu
485 490 495

ttg ttt gtt ctg tgt ata caa cca gca aag gca gca cga tta ctt aca 1536
Leu Phe Val Leu Cys Ile Gln Pro Ala Lys Ala Ala Arg Leu Leu Thr
500 505 510

cat cac cat caa atg aga gtt atc ttt tta cga caa caa ctg gat caa 1584
His His His Gln Met Arg Val Ile Phe Leu Arg Gln Gln Leu Asp Gln
515 520 525

aag aga gat ttt ccg aca gat aaa gac tgc aag gaa tta ggg aga aaa 1632
Lys Arg Asp Phe Pro Thr Asp Lys Asp Cys Lys Glu Leu Gly Arg Lys
530 535 540

tgc caa gat tta gga aag gat tca aaa gaa att aca tgg cca tgt cat 1680
Cys Gln Asp Leu Gly Lys Asp Ser Lys Glu Ile Thr Trp Pro Cys His
545 550 555 560

aca cta gaa cag caa tgc aat cgc tta ggg att aca gaa att tta aaa 1728
Thr Leu Glu Gln Gln Cys Asn Arg Leu Gly Ile Thr Glu Ile Leu Lys
565 570 575

cag att tta ttg gat gaa cac aaa gat act ttg aaa agt cat gaa aac 1776
Gln Ile Leu Leu Asp Glu His Lys Asp Thr Leu Lys Ser His Glu Asn
580 585 590

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tgt gca aaa tat tta aaa aga aaa tgc cat aaa tgg tct aga agg ggt 1824 
Cys Ala Lys Tyr Leu Lys Arg Lys Cys His Lys Trp Ser Arg Arg Gly 
595 600 605 

gat gat cgt ttt tct ttt gta tgt gtt ttc caa aac gct aca tgt gag 1872
Asp Asp Arg Phe Ser Phe Val Cys Val Phe Gln Asn Ala Thr Cys Glu
610 615 620

ctg atg gta aaa gac gtg caa gat agg tgc aaa ata ttc gaa gaa aat 1920
Leu Met Val Lys Asp Val Gln Asp Arg Cys Lys Ile Phe Glu Glu Asn
625 630 635 640

atg caa gca tca gat att aat gat tcc ctt aaa aaa aat caa ata aaa 1968
Met Gln Ala Ser Asp Ile Asn Asp Ser Leu Lys Lys Asn Gln Ile Lys
645 650 655

gca gaa tca gca gca aat att tgt ccc tca tgg cat cca tac tgc gat 2016
Ala Glu Ser Ala Ala Asn Ile Cys Pro Ser Trp His Pro Tyr Cys Asp
660 665 670

aga ttt tta ccc aat tgt cct gat ctt aag aaa gga aaa act ttc tgt 2064
Arg Phe Leu Pro Asn Cys Pro Asp Leu Lys Lys Gly Lys Thr Phe Cys
675 680 685

caa aat ctt aaa aaa tat tgc gaa cca ttc tac aaa aga aag gtt tta 2112
Gln Asn Leu Lys Lys Tyr Cys Glu Pro Phe Tyr Lys Arg Lys Val Leu
690 695 700

gaa gat gct ctt aaa gta gag ctt cga gga aat tta agt aat ata act 2160
Glu Asp Ala Leu Lys Val Glu Leu Arg Gly Asn Leu Ser Asn Ile Thr
705 710 715 720

aaa tgt gaa cct gca tta gaa aga tat tgt aca gta ttg aaa gac gta 2208
Lys Cys Glu Pro Ala Leu Glu Arg Tyr Cys Thr Val Leu Lys Asp Val
725 730 735

aat aat gcg tca atc agc agt tta tgt aaa gat aat acc gaa agt aaa 2256
Asn Asn Ala Ser Ile Ser Ser Leu Cys Lys Asp Asn Thr Glu Ser Lys
740 745 750

act aaa aag gcc gat aat aaa aat gtt aga aag aag ctt tgt cta aaa 2304
Thr Lys Lys Ala Asp Asn Lys Asn Val Arg Lys Lys Leu Cys Leu Lys
755 760 765

tta gtg gaa gag gtg gaa cag caa tgc aaa gta tta cca aca gaa tta 2352
Leu Val Glu Glu Val Glu Gln Gln Cys Lys Val Leu Pro Thr Glu Leu
770 775 780

aca gag ctg gaa aaa agt cta aaa aaa gat gtt aag aca tat gag gaa 2400
Thr Glu Leu Glu Lys Ser Leu Lys Lys Asp Val Lys Thr Tyr Glu Glu
785 790 795 800

ctt aag gaa agg gca aaa aaa gca atg aac aag tcc agc ctt gtt tta 2448
Leu Lys Glu Arg Ala Lys Lys Ala Met Asn Lys Ser Ser Leu Val Leu
805 810 815

tca ctt gtt aag aaa aac gaa agt aat aca tcg aaa aat aat agc aaa 2496
Ser Leu Val Lys Lys Asn Glu Ser Asn Thr Ser Lys Asn Asn Ser Lys
820 825 830
aac aag gar aag aat gtc gtt tca aac gga ctt caa gat acc aca aaa 2544
Asn Lys Asp Lys Asn Val Val Ser Asn Gly Leu Gln Asp Thr Thr Lys
835 840 845

tat gtg aaa ata cta cga aga gga gtt aag gag gca ctt gta aca gaa 2592
Tyr Val Lys Ile Leu Arg Arg Gly Val Lys Glu Ala Leu Val Thr Glu
850 855 860

tct gaa gcc aag gca ttt gat ttg gca gca gaa gtg ttt gga aga tat 2640
Ser Glu Ala Lys Ala Phe Asp Leu Ala Ala Glu Val Phe Gly Arg Tyr
865 870 875 880

gta gac ttg aaa gaa aaa tgt gag aaa ttg act tcg gat tgc ggg att 2688
Val Asp Leu Lys Glu Lys Cys Glu Lys Leu Thr Ser Asp Cys Gly Ile
885 890 895

aaa gac gat tgc gat ggt tta aaa gaa gtg tgt gga aag att gag aag 2736
Lys Asp Asp Cys Asp Gly Leu Lys Glu Val Cys Gly Lys Ile Glu Lys
900 905 910

aca tgt cac gat ctg aag cct ctg gag gtg aag tcg cat gaa ata gtc 2784
Thr Cys His Asp Leu Lys Pro Leu Glu Val Lys Ser His Glu Ile Val
915 920 925

aca gaa agc aca acg acg acc aca acg aca aca acg acc gtt acc gat 2832
Thr Glu Ser Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Val Thr Asp
930 935 940

ccg aag gca aca gaa tgc aaa tcc tta cag aca aca gat aca tgg gtt 2880
Pro Lys Ala Thr Glu Cys Lys Ser Leu Gln Thr Thr Asp Thr Trp Val
945 950 955 960

aca cag aca tcg aca cac aca agc acg tct acc atc aca tct acc atc 2928
Thr Gln Thr Ser Thr His Thr Ser Thr Ser Thr Ile Thr Ser Thr Ile
965 970 975

aca tca aaa ata aca ttg aca tca acg agg cga tgc aaa cca acc aag 2976
Thr Ser Lys Ile Thr Leu Thr Ser Thr Arg Arg Cys Lys Pro Thr Lys
980 985 990

tgt acg aca ggg gat gaa gca gga gac gtg aaa ccg agt gag gga ttg 3024
Cys Thr Thr Gly Asp Glu Ala Gly Asp Val Lys Pro Ser Glu Gly Leu
995 1000 1005

aag atg agt ggg tgg agc gtg atg agg ggg gtg ata gta gca atg gtt 3072
Lys Met Ser Gly Trp Ser Val Met Arg Gly Val Ile Val Ala Met Val
1010 1015 1020

att tcg ttc atg att tag 3090
Ile Ser Phe Met Ile
1025 1030



<210> 6 
<211> 1029 
<212> PRT 

<213> Pneumocystis carinii sp. f. hominis 
<400> 6 

Met Ala Arg Ala Val Lys Arg Arg Ala Lys Gly Ala Gln Asn Ser Ile
1 5 10 15

Asp Glu Glu His Val Leu Ala Leu Ile Leu Lys Lys Asn Gly Leu Glu
20 25 30 

Asp Thr Lys Cys Lys Thr Lys Leu Glu Glu Tyr Cys Lys Thr Leu Thr 
35 40 45 

Asn Ala Gly Leu Asn Pro Glu Lys Val His Glu Lys Leu Lys Asp Phe 
50 55 60 

Cys Asp Asn Gly Lys Arg Asn Glu Lys Cys Gln Asp Leu Lys Asn Lys
65 70 75 80

Val Asn Gln Lys Cys Ile Lys Phe Gln Gly Lys Leu Gln Thr Ala Ala
85 90 95

Gly Lys Lys Ile Ser Glu Leu Thr Asp Glu Asp Cys Lys Lys Asn Glu
100 105 110

Gln Gln Cys Leu Phe Leu Glu Gly Ala Cys Pro Thr Glu Leu Lys Asp
115 120 125

Asp Cys Asn Lys Leu Arg Asn Asn Cys Tyr Gln Lys Glu Arg Asn Asn
130 135 140

Val Ala Glu Glu Val Leu Leu Arg Ala Leu Arg Gly Asp Leu Asn Glu
145 150 155 160

Thr Lys Thr Cys Glu Lys Lys Leu Lys Glu Val Cys Pro Lys Leu Glu
165 170 175

Arg Glu Ser Asp Glu Leu Thr Glu Leu Cys Leu Tyr Gln Lys Thr Thr
180 185 190

Cys Val Ser Leu Val Thr Lys Gly Lys Ser Lys Cys Asp Thr Leu Glu
195 200 205

Lys Glu Val Glu Glu Ala Leu Lys Lys Asn Glu Leu Arg Glu Lys Cys
210 215 220

Leu Leu Leu Leu Glu Gln Cys Tyr Phe His Arg Gly Asn Cys Glu Gly
225 230 235 240

Asp Lys Ser Lys Cys Asn Lys Pro Asn Asn Lys Asp Cys Lys Glu Tyr
245 250 255

Val Pro Glu Cys Asp Glu Leu Ala Glu Lys Cys Gly Lys Glu Asn Ile
260 265 270

Val Tyr Met His Pro Gly Ser Asp Phe Asp Pro Thr Lys Pro Glu Pro
275 280 285

Thr Leu Ala Glu Asp Ile Gly Leu Glu Glu Leu Tyr Lys Arg Ala Glu
290 295 300

Glu Asp Gly Ile Phe Val Gly Arg Gln His Val Arg Asp Ala Thr Ala
305 310 315 320

Leu Leu Ala Leu Leu Leu Lys Lys Thr Leu Lys Lys Glu Glu Cys Ile
325 330 335

Lys Ala Leu Lys Lys Asn Cys Glu Asn Pro His Glu His Glu Ala Leu
340 345 350

Glu Asn Leu Cys Lys Glu Asn Lys Pro Ser Ser Asp Gly Thr Lys Lys 
355 360 365 

Cys Asp Glu Leu Glu Lys Asp Val Asn Lys Thr Cys Thr Ser Leu Thr 
370 375 380 

Ser Thr Ile Leu Lys Asn Arg Leu Tyr Ile Ser Pro Asp Gly Ile Ala
385 390 395 400 

Glu Trp Gly Lys Leu Pro Thr Phe Leu Ser Asp Glu Asp Cys Ala Lys 
405 410 415 

Leu Glu Ser Tyr Cys Phe Tyr Tyr Lys Glu Thr Cys Pro Asp Val Lys 
420 425 430 

Glu Ala Cys Met Asn Val Arg Ala Ala Cys Tyr Lys Arg Gly Leu Asp 
435 440 445 

Ala Arg Ala Asn Ser Val Leu Gln Lys Asn Met Arg Gly Leu Leu His
450 455 460

Gly Ser Asn Lys Asp Trp Leu Lys Lys Phe Gln Gln Glu Leu Ala Lys
465 470 475 480

Val Cys Glu Lys Leu Lys Gly Asn Lys Gly Ser Phe Ser Asn Asp Glu
485 490 495

Leu Phe Val Leu Cys Ile Gln Pro Ala Lys Ala Ala Arg Leu Leu Thr
500 505 510

His His His Gln Met Arg Val Ile Phe Leu Arg Gln Gln Leu Asp Gln
515 520 525

Lys Arg Asp Phe Pro Thr Asp Lys Asp Cys Lys Glu Leu Gly Arg Lys
530 535 540

Cys Gln Asp Leu Gly Lys Asp Ser Lys Glu Ile Thr Trp Pro Cys His
545 550 555 560

Thr Leu Glu Gln Gln Cys Asn Arg Leu Gly Ile Thr Glu Ile Leu Lys
565 570 575

Gln Ile Leu Leu Asp Glu His Lys Asp Thr Leu Lys Ser His Glu Asn
580 585 590

Cys Ala Lys Tyr Leu Lys Arg Lys Cys His Lys Trp Ser Arg Arg Gly
595 600 605

Asp Asp Arg Phe Ser Phe Val Cys Val Phe Gln Asn Ala Thr Cys Glu
610 615 620

Leu Met Val Lys Asp Val Gln Asp Arg Cys Lys Ile Phe Glu Glu Asn
625 630 635 640

Met Gln Ala Ser Asp Ile Asn Asp Ser Leu Lys Lys Asn Gln Ile Lys
645 650 655

Ala Glu Ser Ala Ala Asn Ile Cys Pro Ser Trp His Pro Tyr Cys Asp
660 665 670

Arg Phe Leu Pro Asn Cys Pro Asp Leu Lys Lys Gly Lys Thr Phe Cys 
675 680 685 

Gln Asn Leu Lys Lys Tyr Cys Glu Pro Phe Tyr Lys Arg Lys Val Leu
690 695 700

Glu Asp Ala Leu Lys Val Glu Leu Arg Gly Asn Leu Ser Asn Ile Thr
705 710 715 720

Lys Cys Glu Pro Ala Leu Glu Arg Tyr Cys Thr Val Leu Lys Asp Val
725 730 735

Asn Asn Ala Ser Ile Ser Ser Leu Cys Lys Asp Asn Thr Glu Ser Lys
740 745 750

Thr Lys Lys Ala Asp Asn Lys Asn Val Arg Lys Lys Leu Cys Leu Lys
755 760 765

Leu Val Glu Glu Val Glu Gln Gln Cys Lys Val Leu Pro Thr Glu Leu
770 775 780 

Thr Glu Leu Glu Lys Ser Leu Lys Lys Asp Val Lys Thr Tyr Glu Glu 
785 790 795 800 

Leu Lys Glu Arg Ala Lys Lys Ala Met Asn Lys Ser Ser Leu Val Leu 
805 810 815 

Ser Leu Val Lys Lys Asn Glu Ser Asn Thr Ser Lys Asn Asn Ser Lys 
820 825 830 

Asn Lys Asp Lys Asn Val Val Ser Asn Gly Leu Gln Asp Thr Thr Lys
835 840 845

Tyr Val Lys Ile Leu Arg Arg Gly Val Lys Glu Ala Leu Val Thr Glu
850 855 860

Ser Glu Ala Lys Ala Phe Asp Leu Ala Ala Glu Val Phe Gly Arg Tyr
865 870 875 880

Val Asp Leu Lys Glu Lys Cys Glu Lys Leu Thr Ser Asp Cys Gly Ile
885 890 895

Lys Asp Asp Cys Asp Gly Leu Lys Glu Val Cys Gly Lys Ile Glu Lys
900 905 910

Thr Cys His Asp Leu Lys Pro Leu Glu Val Lys Ser His Glu Ile Val
915 920 925 

Thr Glu Ser Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Val Thr Asp 
930 935 940 

Pro Lys Ala Thr Glu Cys Lys Ser Leu Gln Thr Thr Asp Thr Trp Val
945 950 955 960

Thr Gln Thr Ser Thr His Thr Ser Thr Ser Thr Ile Thr Ser Thr Ile
965 970 975

Thr Ser Lys Ile Thr Leu Thr Ser Thr Arg Arg Cys Lys Pro Thr Lys
980 985 990

Cys Thr Thr Gly Asp Glu Ala Gly Asp Val Lys Pro Ser Glu Gly Leu
995 1000 1005

Lys Met Ser Gly Trp Ser Val Met Arg Gly Val Ile Val Ala Met Val
1010 1015 1020

Ile Ser Phe Met Ile
1025



<210> 7 
<211> 3084 
<212> DNA 

<213> Pneumocystis carinii sp. f. hominis 

<220> 

<221> CDS 

<222> (1)..(3084)

<400> 7 

atg gcg cgg gcg gtc aag cgg cag gca aaa ggt gca cag aat agc att 48
Met Ala Arg Ala Val Lys Arg Gln Ala Lys Gly Ala Gln Asn Ser Ile
1 5 10 15

gat gag gag cat gtt tta gct ttg att tta aaa aaa aat gga tta gaa 96
Asp Glu Glu His Val Leu Ala Leu Ile Leu Lys Lys Asn Gly Leu Glu
20 25 30

gat aca aaa tgc aaa act aag ttg gaa gaa tat tgc aaa aca tta aca 144
Asp Thr Lys Cys Lys Thr Lys Leu Glu Glu Tyr Cys Lys Thr Leu Thr
35 40 45

aat gca gga tta aat cca gaa aaa gtt cac gaa aaa tta aaa gat ttc 192
Asn Ala Gly Leu Asn Pro Glu Lys Val His Glu Lys Leu Lys Asp Phe
50 55 60

tgt gat aac ggg aaa cga aat gaa aaa tgt caa gat cta aaa aac aaa 240
Cys Asp Asn Gly Lys Arg Asn Glu Lys Cys Gln Asp Leu Lys Asn Lys
65 70 75 80

gtc aat caa aaa tgc att aaa ttt caa gga aaa ctt caa aca gct gct 288
Val Asn Gln Lys Cys Ile Lys Phe Gln Gly Lys Leu Gln Thr Ala Ala
85 90 95

aga aaa aaa att tca gaa tta aca gat gag gat tgc aaa aag aat gaa 336
Arg Lys Lys Ile Ser Glu Leu Thr Asp Glu Asp Cys Lys Lys Asn Glu
100 105 110

caa caa tgc cta ttt ttg gag gga gca tgt cca aca gaa ctt aaa gat 384
Gln Gln Cys Leu Phe Leu Glu Gly Ala Cys Pro Thr Glu Leu Lys Asp
115 120 125

gac tgc aat aaa tta agg aat aac tgt tat caa aaa gaa cgg aac aat 432
Asp Cys Asn Lys Leu Arg Asn Asn Cys Tyr Gln Lys Glu Arg Asn Asn
130 135 140

gtg gca gaa gaa gtt ctt ttg agg gcg ctt cgt ggt gat ctc aat gaa 480
Val Ala Glu Glu Val Leu Leu Arg Ala Leu Arg Gly Asp Leu Asn Glu
145 150 155 160
aca aag aca tgt gaa aaa aaa ctg aaa gaa gtt tgc ccg aaa tta gaa 528
Thr Lys Thr Cys Glu Lys Lys Leu Lys Glu Val Cys Pro Lys Leu Glu
165 170 175

aga gaa agc gat gaa tta acg gag ctt tgt ctt tat caa aaa aca aca 576
Arg Glu Ser Asp Glu Leu Thr Glu Leu Cys Leu Tyr Gln Lys Thr Thr
180 185 190

tgc gta agt ctt gta aca aaa gga aaa agt aaa tgt gat act ctt gaa 624
Cys Val Ser Leu Val Thr Lys Gly Lys Ser Lys Cys Asp Thr Leu Glu
195 200 205

aaa gaa gtt gaa gaa gca ctt aag aag aat gaa ttg cga gaa aaa tgt 672
Lys Glu Val Glu Glu Ala Leu Lys Lys Asn Glu Leu Arg Glu Lys Cys
210 215 220

cta cta tta ctt gag caa tgt tac ttt cac aga ggg aac tgt gaa gga 720
Leu Leu Leu Leu Glu Gln Cys Tyr Phe His Arg Gly Asn Cys Glu Gly
225 230 235 240

gac aaa tca aag tgc aat aaa cct aat aat aaa gac tgc aaa gaa tat 768
Asp Lys Ser Lys Cys Asn Lys Pro Asn Asn Lys Asp Cys Lys Glu Tyr
245 250 255

gta cca gag tgt gat gaa tta gca gaa aag tgt gga aaa gaa aat att 816
Val Pro Glu Cys Asp Glu Leu Ala Glu Lys Cys Gly Lys Glu Asn Ile
260 265 270

gtt tat atg cat cca gga tcc gat ttc gat cca act aag cca gag cct 864
Val Tyr Met His Pro Gly Ser Asp Phe Asp Pro Thr Lys Pro Glu Pro
275 280 285

aca cta gca gag gac ata ggg ctg gaa gag ctt tat aag agg gca gaa 912
Thr Leu Ala Glu Asp Ile Gly Leu Glu Glu Leu Tyr Lys Arg Ala Glu
290 295 300

gag gat gga att ttt gtt gga aga caa cat gta aga gat gca aca gct 960
Glu Asp Gly Ile Phe Val Gly Arg Gln His Val Arg Asp Ala Thr Ala
305 310 315 320

ttg ttg gca cta ctt ctt aag aaa acc ctt aaa aaa gaa gaa tgt ata 1008
Leu Leu Ala Leu Leu Leu Lys Lys Thr Leu Lys Lys Glu Glu Cys Ile
325 330 335

aaa gcc ctt aaa aaa aac tgc gaa aac cct cat gaa cat gag gcc tta 1056
Lys Ala Leu Lys Lys Asn Cys Glu Asn Pro His Glu His Glu Ala Leu
340 345 350

gaa aat cta tgt aag gaa aat aaa cca agt agt gat gga acg aaa aaa 1104
Glu Asn Leu Cys Lys Glu Asn Lys Pro Ser Ser Asp Gly Thr Lys Lys
355 360 365

tgt gat gaa cta gaa aaa gat gtt aac aaa act tgt aca agt ctt aca 1152
Cys Asp Glu Leu Glu Lys Asp Val Asn Lys Thr Cys Thr Ser Leu Thr
370 375 380

tca aca att ctt aaa aac cgt ctt tac att tca cct gat gga att gcg 1200
Ser Thr Ile Leu Lys Asn Arg Leu Tyr Ile Ser Pro Asp Gly Ile Ala
385 390 395 400

gaa tgg gga aaa tta ccg aca ttt ctt agt gat gaa gat tgt gca aaa 1248
Glu Trp Gly Lys Leu Pro Thr Phe Leu Ser Asp Glu Asp Cys Ala Lys
405 410 415

cta gaa tct tat tgc ttt tat tat aaa gaa act tgt cca gat gtc aaa 1296
Leu Glu Ser Tyr Cys Phe Tyr Tyr Lys Glu Thr Cys Pro Asp Val Lys
420 425 430

gaa gct tgt atg aat gtg agg gca gcg tgt tac aag aga ggg ctt gat 1344
Glu Ala Cys Met Asn Val Arg Ala Ala Cys Tyr Lys Arg Gly Leu Asp
435 440 445

gca cgg gca aac agt gtg ttg caa aaa aat atg cgt ggg tta tta cgt 1392
Ala Arg Ala Asn Ser Val Leu Gln Lys Asn Met Arg Gly Leu Leu Arg
450 455 460

ggt tca aat caa agt tgg ctt aag gag ttt caa caa aga tta gta aaa 1440
Gly Ser Asn Gln Ser Trp Leu Lys Glu Phe Gln Gln Arg Leu Val Lys
465 470 475 480

gta tgt aag gag cta aaa gaa aat aaa gga agt ttc cca aac gat gaa 1488
Val Cys Lys Glu Leu Lys Glu Asn Lys Gly Ser Phe Pro Asn Asp Glu
485 490 495

ata ttt gtt ctg tgt gta cag cca gca aaa gct gca cga tta ctt aca 1536
Ile Phe Val Leu Cys Val Gln Pro Ala Lys Ala Ala Arg Leu Leu Thr
500 505 510

cac gat cat caa atg agg gtt acc ttt tta cga caa caa ttg gat caa 1584
His Asp His Gln Met Arg Val Thr Phe Leu Arg Gln Gln Leu Asp Gln
515 520 525

aag aga gat ttt ccg aca gat aaa gac tgc aag gaa cta ggg aaa aaa 1632
Lys Arg Asp Phe Pro Thr Asp Lys Asp Cys Lys Glu Leu Gly Lys Lys
530 535 540

tgc caa gat tta gga aag gat tca aaa gaa att aca tgg cca tgt cat 1680
Cys Gln Asp Leu Gly Lys Asp Ser Lys Glu Ile Thr Trp Pro Cys His
545 550 555 560

aca ctg gag cag caa tgc aat cgc ttg ggg act aca gaa att tta aag 1728
Thr Leu Glu Gln Gln Cys Asn Arg Leu Gly Thr Thr Glu Ile Leu Lys
565 570 575

cag gtt tta ttg gat gaa cac aaa gat act ttg aaa gac caa gaa agt 1776
Gln Val Leu Leu Asp Glu His Lys Asp Thr Leu Lys Asp Gln Glu Ser
580 585 590

tgt gta aaa tac cta aaa gaa aag tgt aat aaa tgg tct aga aga gga 1824
Cys Val Lys Tyr Leu Lys Glu Lys Cys Asn Lys Trp Ser Arg Arg Gly
595 600 605

gat gac cgt ttc tct ttt gta tgt gtt ttc caa aac gct acg tgt gag 1872
Asp Asp Arg Phe Ser Phe Val Cys Val Phe Gln Asn Ala Thr Cys Glu
610 615 620

ctg atg gta aaa gac gtg aaa gac agg tgt gaa gta ttc aaa aaa aat 1920
Leu Met Val Lys Asp Val Lys Asp Arg Cys Glu Val Phe Lys Lys Asn
625 630 635 640

ata aaa gct tca tat att att gaa ttt ctt gaa aat aat aca aat aaa 1968
Ile Lys Ala Ser Tyr Ile Ile Glu Phe Leu Glu Asn Asn Thr Asn Lys
645 650 655

ata aca aca ctg gaa aga aat tgt ccc tct tgg cat acg tat tgc aat 2016
Ile Thr Thr Leu Glu Arg Asn Cys Pro Ser Trp His Thr Tyr Cys Asn
660 665 670

aga ttt tca cct aat tgt cca ggc ctt acg aaa gag aat agt tgt aca 2064
Arg Phe Ser Pro Asn Cys Pro Gly Leu Thr Lys Glu Asn Ser Cys Thr
675 680 685

aaa atc aag aag cat tgt gag ccg ttc tat aaa aga aag gcc ttg gaa 2112
Lys Ile Lys Lys His Cys Glu Pro Phe Tyr Lys Arg Lys Ala Leu Glu
690 695 700

gat gct ctc aaa gta gag ctt caa gga aaa ttg act gat aaa tct aaa 2160
Asp Ala Leu Lys Val Glu Leu Gln Gly Lys Leu Thr Asp Lys Ser Lys
705 710 715 720

tgt gaa cct gca ttg aac aga tat tgt aca gta gcg gga aac gta aat 2208
Cys Glu Pro Ala Leu Asn Arg Tyr Cys Thr Val Ala Gly Asn Val Asn
725 730 735

aat gcg tca atc agt ggc tta tgc aaa gct aac acc aag gat aac tct 2256
Asn Ala Ser Ile Ser Gly Leu Cys Lys Ala Asn Thr Lys Asp Asn Ser
740 745 750

gga aag agt gat gag gat gct aga aag gaa ctc tgt gag aaa tca gtg 2304
Gly Lys Ser Asp Glu Asp Ala Arg Lys Glu Leu Cys Glu Lys Ser Val
755 760 765

aaa gaa gtg gaa gaa cag tgc aaa gca tta cca aca gaa tta gga caa 2352
Lys Glu Val Glu Glu Gln Cys Lys Ala Leu Pro Thr Glu Leu Gly Gln
770 775 780

ccg gca gct gat cta aaa aaa gat tat aag aca tat gag gaa ctt aag 2400
Pro Ala Ala Asp Leu Lys Lys Asp Tyr Lys Thr Tyr Glu Glu Leu Lys
785 790 795 800

aaa cgt gca gag gaa gca atg aac aag tcc agt ctt gtt ttg tca ctc 2448
Lys Arg Ala Glu Glu Ala Met Asn Lys Ser Ser Leu Val Leu Ser Leu
805 810 815

att aag aaa aac gaa agt aat gta tca aaa agt aat agc aaa aac aag 2496
Ile Lys Lys Asn Glu Ser Asn Val Ser Lys Ser Asn Ser Lys Asn Lys
820 825 830

gat aag aat gcc gtt tca aac gga ctt caa gat acc aca aaa cat gtg 2544
Asp Lys Asn Ala Val Ser Asn Gly Leu Gln Asp Thr Thr Lys His Val
835 840 845

aaa ata cta cgg aga gga gtt aag gat gta tcc gta aca gaa tta gaa 2592
Lys Ile Leu Arg Arg Gly Val Lys Asp Val Ser Val Thr Glu Leu Glu
850 855 860

gct aaa gca ttt gat ttg gca gca gaa gta ttt gga aga tat gta gat 2640
Ala Lys Ala Phe Asp Leu Ala Ala Glu Val Phe Gly Arg Tyr Val Asp
865 870 875 880

ttg aag gaa aga tgt aat aaa ttg gaa tca gat tgc aga att aag gag 2688
Leu Lys Glu Arg Cys Asn Lys Leu Glu Ser Asp Cys Arg Ile Lys Glu
885 890 895
gat tgc aaa gac tta gaa gaa gta tgc aaa aag att aat aag gct tgt 2736
Asp Cys Lys Asp Leu Glu Glu Val Cys Lys Lys Ile Asn Lys Ala Cys
900 905 910

cgc aat ctg aag cct ctg gag gtg aag ccg cac gaa aca gtg aca gaa 2784
Arg Asn Leu Lys Pro Leu Glu Val Lys Pro His Glu Thr Val Thr Glu
915 920 925

ggt aca acg aca act aca aca aca aca aca acc gtt gcc gat ccg aag 2832
Gly Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Val Ala Asp Pro Lys
930 935 940

gca acg gaa tgc aaa tcc tta cag aca aca gac aca tgg gtt aca cag 2880
Ala Thr Glu Cys Lys Ser Leu Gln Thr Thr Asp Thr Trp Val Thr Gln
945 950 955 960

aca tcg aca cac aca agc acg tct act atc aca tct acc atc aca tca 2928
Thr Ser Thr His Thr Ser Thr Ser Thr Ile Thr Ser Thr Ile Thr Ser
965 970 975

aaa ata aca ttg aca tca acg agg cga tgc aaa cca acc aag tgt acg 2976
Lys Ile Thr Leu Thr Ser Thr Arg Arg Cys Lys Pro Thr Lys Cys Thr
980 985 990

aca ggg gat gat gca gaa gac gtg aag cca agt gaa ggc ttg agg gtg 3024
Thr Gly Asp Asp Ala Glu Asp Val Lys Pro Ser Glu Gly Leu Arg Val
995 1000 1005

agc ggg tgg aat gtg atg agg ggg gtg ata gta gca atg gtt att tcg 3072
Ser Gly Trp Asn Val Met Arg Gly Val Ile Val Ala Met Val Ile Ser
1010 1015 1020

ttc atg att tag 3084
Phe Met Ile
1025



<210> 8 
<211> 1027 
<212> PRT 

<213> Pneumocystis carinii sp. f. hominis
<400> 8

Met Ala Arg Ala Val Lys Arg Gln Ala Lys Gly Ala Gln Asn Ser Ile
1 5 10 15

Asp Glu Glu His Val Leu Ala Leu Ile Leu Lys Lys Asn Gly Leu Glu
20 25 30 

Asp Thr Lys Cys Lys Thr Lys Leu Glu Glu Tyr Cys Lys Thr Leu Thr 
35 40 45 

Asn Ala Gly Leu Asn Pro Glu Lys Val His Glu Lys Leu Lys Asp Phe 
50 55 60 

Cys Asp Asn Gly Lys Arg Asn Glu Lys Cys Gln Asp Leu Lys Asn Lys
65 70 75 80

Val Asn Gln Lys Cys Ile Lys Phe Gln Gly Lys Leu Gln Thr Ala Ala
85 90 95

Arg Lys Lys Ile Ser Glu Leu Thr Asp Glu Asp Cys Lys Lys Asn Glu
100 105 110 

Gln Gln Cys Leu Phe Leu Glu Gly Ala Cys Pro Thr Glu Leu Lys Asp
115 120 125

Asp Cys Asn Lys Leu Arg Asn Asn Cys Tyr Gln Lys Glu Arg Asn Asn
130 135 140

Val Ala Glu Glu Val Leu Leu Arg Ala Leu Arg Gly Asp Leu Asn Glu
145 150 155 160

Thr Lys Thr Cys Glu Lys Lys Leu Lys Glu Val Cys Pro Lys Leu Glu
165 170 175

Arg Glu Ser Asp Glu Leu Thr Glu Leu Cys Leu Tyr Gln Lys Thr Thr
180 185 190

Cys Val Ser Leu Val Thr Lys Gly Lys Ser Lys Cys Asp Thr Leu Glu
195 200 205

Lys Glu Val Glu Glu Ala Leu Lys Lys Asn Glu Leu Arg Glu Lys Cys
210 215 220

Leu Leu Leu Leu Glu Gln Cys Tyr Phe His Arg Gly Asn Cys Glu Gly
225 230 235 240

Asp Lys Ser Lys Cys Asn Lys Pro Asn Asn Lys Asp Cys Lys Glu Tyr
245 250 255

Val Pro Glu Cys Asp Glu Leu Ala Glu Lys Cys Gly Lys Glu Asn Ile
260 265 270

Val Tyr Met His Pro Gly Ser Asp Phe Asp Pro Thr Lys Pro Glu Pro
275 280 285

Thr Leu Ala Glu Asp Ile Gly Leu Glu Glu Leu Tyr Lys Arg Ala Glu
290 295 300

Glu Asp Gly Ile Phe Val Gly Arg Gln His Val Arg Asp Ala Thr Ala
305 310 315 320

Leu Leu Ala Leu Leu Leu Lys Lys Thr Leu Lys Lys Glu Glu Cys Ile
325 330 335 

Lys Ala Leu Lys Lys Asn Cys Glu Asn Pro His Glu His Glu Ala Leu 
340 345 350 

Glu Asn Leu Cys Lys Glu Asn Lys Pro Ser Ser Asp Gly Thr Lys Lys 
355 360 365 

Cys Asp Glu Leu Glu Lys Asp Val Asn Lys Thr Cys Thr Ser Leu Thr 
370 375 380 

Ser Thr Ile Leu Lys Asn Arg Leu Tyr Ile Ser Pro Asp Gly Ile Ala
385 390 395 400

Glu Trp Gly Lys Leu Pro Thr Phe Leu Ser Asp Glu Asp Cys Ala Lys
405 410 415

Leu Glu Ser Tyr Cys Phe Tyr Tyr Lys Glu Thr Cys Pro Asp Val Lys
420 425 430

Glu Ala Cys Met Asn Val Arg Ala Ala Cys Tyr Lys Arg Gly Leu Asp
435 440 445

Ala Arg Ala Asn Ser Val Leu Gln Lys Asn Met Arg Gly Leu Leu Arg
450 455 460

Gly Ser Asn Gln Ser Trp Leu Lys Glu Phe Gln Gln Arg Leu Val Lys
465 470 475 480

Val Cys Lys Glu Leu Lys Glu Asn Lys Gly Ser Phe Pro Asn Asp Glu
485 490 495

Ile Phe Val Leu Cys Val Gln Pro Ala Lys Ala Ala Arg Leu Leu Thr
500 505 510

His Asp His Gln Met Arg Val Thr Phe Leu Arg Gln Gln Leu Asp Gln
515 520 525

Lys Arg Asp Phe Pro Thr Asp Lys Asp Cys Lys Glu Leu Gly Lys Lys
530 535 540

Cys Gln Asp Leu Gly Lys Asp Ser Lys Glu Ile Thr Trp Pro Cys His
545 550 555 560

Thr Leu Glu Gln Gln Cys Asn Arg Leu Gly Thr Thr Glu Ile Leu Lys
565 570 575

Gln Val Leu Leu Asp Glu His Lys Asp Thr Leu Lys Asp Gln Glu Ser
580 585 590

Cys Val Lys Tyr Leu Lys Glu Lys Cys Asn Lys Trp Ser Arg Arg Gly
595 600 605

Asp Asp Arg Phe Ser Phe Val Cys Val Phe Gln Asn Ala Thr Cys Glu
610 615 620

Leu Met Val Lys Asp Val Lys Asp Arg Cys Glu Val Phe Lys Lys Asn
625 630 635 640

Ile Lys Ala Ser Tyr Ile Ile Glu Phe Leu Glu Asn Asn Thr Asn Lys
645 650 655

Ile Thr Thr Leu Glu Arg Asn Cys Pro Ser Trp His Thr Tyr Cys Asn
660 665 670

Arg Phe Ser Pro Asn Cys Pro Gly Leu Thr Lys Glu Asn Ser Cys Thr
675 680 685

Lys Ile Lys Lys His Cys Glu Pro Phe Tyr Lys Arg Lys Ala Leu Glu
690 695 700

Asp Ala Leu Lys Val Glu Leu Gln Gly Lys Leu Thr Asp Lys Ser Lys
705 710 715 720

Cys Glu Pro Ala Leu Asn Arg Tyr Cys Thr Val Ala Gly Asn Val Asn
725 730 735

Asn Ala Ser Ile Ser Gly Leu Cys Lys Ala Asn Thr Lys Asp Asn Ser
740 745 750

Gly Lys Ser Asp Glu Asp Ala Arg Lys Glu Leu Cys Glu Lys Ser Val
755 760 765

Lys Glu Val Glu Glu Gln Cys Lys Ala Leu Pro Thr Glu Leu Gly Gln
770 775 780

Pro Ala Ala Asp Leu Lys Lys Asp Tyr Lys Thr Tyr Glu Glu Leu Lys
785 790 795 800

Lys Arg Ala Glu Glu Ala Met Asn Lys Ser Ser Leu Val Leu Ser Leu
805 810 815

Ile Lys Lys Asn Glu Ser Asn Val Ser Lys Ser Asn Ser Lys Asn Lys
820 825 830

Asp Lys Asn Ala Val Ser Asn Gly Leu Gln Asp Thr Thr Lys His Val
835 840 845

Lys Ile Leu Arg Arg Gly Val Lys Asp Val Ser Val Thr Glu Leu Glu
850 855 860

Ala Lys Ala Phe Asp Leu Ala Ala Glu Val Phe Gly Arg Tyr Val Asp
865 870 875 880

Leu Lys Glu Arg Cys Asn Lys Leu Glu Ser Asp Cys Arg Ile Lys Glu
885 890 895

Asp Cys Lys Asp Leu Glu Glu Val Cys Lys Lys Ile Asn Lys Ala Cys
900 905 910

Arg Asn Leu Lys Pro Leu Glu Val Lys Pro His Glu Thr Val Thr Glu
915 920 925

Gly Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Val Ala Asp Pro Lys
930 935 940

Ala Thr Glu Cys Lys Ser Leu Gln Thr Thr Asp Thr Trp Val Thr Gln
945 950 955 960

Thr Ser Thr His Thr Ser Thr Ser Thr Ile Thr Ser Thr Ile Thr Ser
965 970 975

Lys Ile Thr Leu Thr Ser Thr Arg Arg Cys Lys Pro Thr Lys Cys Thr
980 985 990

Thr Gly Asp Asp Ala Glu Asp Val Lys Pro Ser Glu Gly Leu Arg Val
995 1000 1005

Ser Gly Trp Asn Val Met Arg Gly Val Ile Val Ala Met Val Ile Ser
1010 1015 1020

Phe Met Ile
1025



<210> 9 
<211> 3081 

<212> DNA

<213> Pneumocystis carinii sp. f. hominis 
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<220> 

<221> CDS 

<222> (1)..(3030)



<400> 9

atg gcg cgg gcg gtc aag cgg cag gct gca aaa gca tca ggg gct agt 48
Met Ala Arg Ala Val Lys Arg Gln Ala Ala Lys Ala Ser Gly Ala Ser
1 5 10 15

gta tat gat ggt gaa gaa att ctt ttg gct tta att gca gga aaa aaa 96
Val Tyr Asp Gly Glu Glu Ile Leu Leu Ala Leu Ile Ala Gly Lys Lys
20 25 30

tat aat gat aat gaa tgc aaa aaa gaa tta gaa aaa tat tgt aag aca 144
Tyr Asn Asp Asn Glu Cys Lys Lys Glu Leu Glu Lys Tyr Cys Lys Thr
35 40 45

tta acg gat gca gaa tta aaa cca gaa aaa gtt cac aaa aaa ctt aag 192
Leu Thr Asp Ala Glu Leu Lys Pro Glu Lys Val His Lys Lys Leu Lys
50 55 60

gag ttt tgt gaa aat aaa aaa gca gat tca aaa tgc aaa gaa ctg aaa 240
Glu Phe Cys Glu Asn Lys Lys Ala Asp Ser Lys Cys Lys Glu Leu Lys
65 70 75 80

gaa aaa ctc act caa aaa tgt act gca atc aaa gga aaa ctt aca gaa 288
Glu Lys Leu Thr Gln Lys Cys Thr Ala Ile Lys Gly Lys Leu Thr Glu
85 90 95

gca atc aaa aaa aaa aat tca gat tta acg gat gaa gat tgc aaa gag 336
Ala Ile Lys Lys Lys Asn Ser Asp Leu Thr Asp Glu Asp Cys Lys Glu
100 105 110

aat gaa caa caa tgc cta ttt ttg gag gga gca tgt cca gcg gaa ctt 384
Asn Glu Gln Gln Cys Leu Phe Leu Glu Gly Ala Cys Pro Ala Glu Leu
115 120 125

aaa gat gat tgc aat act ttg aga aat aag tgc tat caa aag aag cgt 432
Lys Asp Asp Cys Asn Thr Leu Arg Asn Lys Cys Tyr Gln Lys Lys Arg
130 135 140

gat aaa gtg gcg gaa gaa gct ctt tta aga gca gtt cgt gga ggt cta 480
Asp Lys Val Ala Glu Glu Ala Leu Leu Arg Ala Val Arg Gly Gly Leu
145 150 155 160

atc aat gaa act aca tgt gaa gga aag ctc aaa gag gtt tgc ata gag 528
Ile Asn Glu Thr Thr Cys Glu Gly Lys Leu Lys Glu Val Cys Ile Glu
165 170 175

ttg agt caa gaa agt gat gag tta acg aag ctt tgt ctt tat caa aaa 576
Leu Ser Gln Glu Ser Asp Glu Leu Thr Lys Leu Cys Leu Tyr Gln Lys
180 185 190

atg acg tgc aaa aca ttt gta tta gaa aaa caa aaa aaa tgt aat gct 624
Met Thr Cys Lys Thr Phe Val Leu Glu Lys Gln Lys Lys Cys Asn Ala
195 200 205

ctt aaa cag gat gtt aac gca gca ctt gag aag aaa gat gag tta cga 672
Leu Lys Gln Asp Val Asn Ala Ala Leu Glu Lys Lys Asp Glu Leu Arg
210 215 220
aaa aaa tct tta cca ctg ctt gaa cga tgc tat ttt tat aga ggg aat 720
Gly Lys Cys Leu Pro Leu Leu Glu Arg Cys Tyr Phe Tyr Arg Gly Asn
225 230 235 240

tgt gaa gat ata tca aaa tgt aat aaa tca tcc gaa gac tgt tat gaa 768
Cys Glu Asp Ile Ser Lys Cys Asn Lys Ser Ser Glu Asp Cys Tyr Glu
245 250 255

tat ttg cca gtg tgt gat aca ttg gca gtg aaa tgt gaa gaa aat aag 816
Tyr Leu Pro Val Cys Asp Thr Leu Ala Val Lys Cys Glu Glu Asn Lys
260 265 270

att att tat aca cat ccg gga tcc gat ttc aat cca act aag tca aag 864
Ile Ile Tyr Thr His Pro Gly Ser Asp Phe Asn Pro Thr Lys Ser Lys
275 280 285

cct act gta gca gaa gac ata gga ctg gaa gag ctt tat aaa aag gcc 912
Pro Thr Val Ala Glu Asp Ile Gly Leu Glu Glu Leu Tyr Lys Lys Ala
290 295 300

gca gaa gaa ggt gtt cat att gga aag cct cct gta aga gat gca act 960
Ala Glu Glu Gly Val His Ile Gly Lys Pro Pro Val Arg Asp Ala Thr
305 310 315 320

gct cta ctg gcg ctt ttg att caa aat cta gat cct aag agt caa gtg 1008
Ala Leu Leu Ala Leu Leu Ile Gln Asn Leu Asp Pro Lys Ser Gln Val
325 330 335

ggt aaa gaa tgc gaa aaa gtt ctt aaa gat aac tgt aaa gag tta aaa 1056
Gly Lys Glu Cys Glu Lys Val Leu Lys Asp Asn Cys Lys Glu Leu Lys
340 345 350

agt cat gaa att ttg gga gat ttt tgt aat caa aat gta gct ggt caa 1104
Ser His Glu Ile Leu Gly Asp Phe Cys Asn Gln Asn Val Ala Gly Gln
355 360 365

aat gaa att gaa aag tgt aaa gag tta gag aag gag tta gca aac agt 1152
Asn Glu Ile Glu Lys Cys Lys Glu Leu Glu Lys Glu Leu Ala Asn Ser
370 375 380

act aaa att ctt ttt gaa aaa ata aag aat aaa cac ctc tct gga tcc 1200
Thr Lys Ile Leu Phe Glu Lys Ile Lys Asn Lys His Leu Ser Gly Ser
385 390 395 400

gga gaa gtc att cca tgg tat aag ttg acg aca ttt ctt agt gac aat 1248
Gly Glu Val Ile Pro Trp Tyr Lys Leu Thr Thr Phe Leu Ser Asp Asn
405 410 415

gac tgc aca agg tta gag tca gac tgt ttt tat tta aaa agt caa gca 1296
Asp Cys Thr Arg Leu Glu Ser Asp Cys Phe Tyr Leu Lys Ser Gln Ala
420 425 430

cct ctt gac aaa gaa tgt aat aat ctg aag gca gca tgt tat aag aga 1344
Pro Leu Asp Lys Glu Cys Asn Asn Leu Lys Ala Ala Cys Tyr Lys Arg
435 440 445

aaa ctt gaa gca caa gct aat gaa gca ttg cag aaa aag atg tac gga 1392
Gly Leu Glu Ala Gln Ala Asn Glu Ala Leu Gln Lys Lys Met Tyr Gly
450 455 460
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ctg ttc tar ggz tea ggc aaa gaa tgg ttt aag aaa eta eta gaa aaa 1440 

Leu Phe Tyr Gly Ser Gly Lys Glu Trp Phe Lys Lys Leu Leu Glu Lys 

465 " 470 475 480 

ata atg gaa gaa tgt teg gaa ctt aaa aca aca age gat gag ttg ttt 1488 

lie Met Glu Glu Cys Ser Glu Leu Lys Thr Thr Ser Asp Glu Leu Phe 

485 490 " 495 

ttg eta tgt att gat cca ctt aaa gca gtc aga ata ctt gca get gat 1536 

Leu Leu Cys lie Asp Pro Leu Lys Ala Val Arg He Leu Ala Ala Asp 

500 505 510 

ate caa gca aga gca gtc ttt ttg egg aaa caa ttg gat caa aag cga 1584 

He Gin Ala Arg Ala Val Phe Leu Arg Lys Gin Leu Asp Gin Lys Arg 
515 520 525 

gac ttt cca aca gac aaa gat tgc aag gaa tta gga aga aag tgt gaa 1632 

Asp Phe Pro Thr Asp Lys Asp Cys Lys Glu Leu Gly Arg Lys Cys Glu 
530 535 540 

get tta ggg aag gat tea aat cag att aag tgg cca tgt cat acg eta 1680 

Ala Leu Gly Lys Asp Ser Asn Gin He Lys Trp Pro Cys His Thr Leu 

545 550 555 560 

aaa caa cag tgt gat cgc ttg ggg act aca gaa ate ttg aaa cag gtt 1728 

Lys Gin Gin Cys Asp Arg Leu Gly Thr Thr Glu He Leu Lys Gin Val 

565 570 575 

tta eta gat gaa cac aag gat act tta aga act cat gaa aac tgt acg 1776 

Leu Leu Asp Glu His Lys Asp Thr Leu Arg Thr His Glu Asn Cys Thr 

580 585 590 

aaa tat tta aag aga aaa tgt cat aaa tgg tct aga agg ggt gat gat 1824 

Lys Tyr Leu Lys Arg Lys Cys His Lys Trp Ser Arg Arg Gly Asp Asp 
595 600 605 

cgt ttc tct ttt gta tgt gtt tac caa aac gct acg tgt aag ctg ata 1872
Arg Phe Ser Phe Val Cys Val Tyr Gln Asn Ala Thr Cys Lys Leu Ile
610 615 620 

gta gat gat gtg aaa gac agg tgt gaa gta ttt gaa aaa aat atg caa 1920 

Val Asp Asp Val Lys Asp Arg Cys Glu Val Phe Glu Lys Asn Met Gln

625 630 635 640 

gcg tca gat att aat aat tct ctt aaa aat aaa caa ata aaa aca gaa 1968
Ala Ser Asp Ile Asn Asn Ser Leu Lys Asn Lys Gln Ile Lys Thr Glu

645 650 655 

tca gca gca aat att tgt ccc tca tgg cac cca tac tgc gat aga ttt 2016
Ser Ala Ala Asn Ile Cys Pro Ser Trp His Pro Tyr Cys Asp Arg Phe

660 665 670 

tta ccc aat tgt cct gat ctt aag aaa gga aaa act ttc tgt caa aat 2064 

Leu Pro Asn Cys Pro Asp Leu Lys Lys Gly Lys Thr Phe Cys Gln Asn
675 680 685 

ctt aaa aaa tat tgc gaa cca ttc tac aaa agg aag gtt tta gaa gat 2112 

Leu Lys Lys Tyr Cys Glu Pro Phe Tyr Lys Arg Lys Val Leu Glu Asp 
690 695 700 

gct ctt aaa gta gag ctt caa ggg aat tta agt aat aga aat aaa tgt 2160
Ala Leu Lys Val Glu Leu Gln Gly Asn Leu Ser Asn Arg Asn Lys Cys
705 710 715 720

gaa tct gca tta gaa aga tat tgc aca ata ttg aaa aat gta agt gat 2208
Glu Ser Ala Leu Glu Arg Tyr Cys Thr Ile Leu Lys Asn Val Ser Asp
725 730 735

tca tca atc aac agt tta tgt aaa gat aat acc gaa agt aaa act aaa 2256
Ser Ser Ile Asn Ser Leu Cys Lys Asp Asn Thr Glu Ser Lys Thr Lys
740 745 750

aag acc gat aat gaa gtt aga aag aag ctt tgt cta aaa tta gtg gaa 2304
Lys Thr Asp Asn Glu Val Arg Lys Lys Leu Cys Leu Lys Leu Val Glu
755 760 765

gag gtg gaa cag caa tgt aaa atg tta cca gca gaa ttg gag cat gag 2352
Glu Val Glu Gln Gln Cys Lys Met Leu Pro Ala Glu Leu Glu His Glu
770 775 780

gaa aaa gac cta aaa gat gat ttt gaa aca ttt gaa aaa ctt aaa aaa 2400
Glu Lys Asp Leu Lys Asp Asp Phe Glu Thr Phe Glu Lys Leu Lys Lys
785 790 795 800

cag gca gag aaa aca atg aat aaa tcc aat ctt gtt tta tca ttc gtt 2448
Gln Ala Glu Lys Thr Met Asn Lys Ser Asn Leu Val Leu Ser Phe Val
805 810 815

aag aaa gat gaa aat aat aca tcg aaa aat agt agc aaa gac aag gat 2496
Lys Lys Asp Glu Asn Asn Thr Ser Lys Asn Ser Ser Lys Asp Lys Asp
820 825 830

aag aat acc gtt tca aac gga ctt caa gat acc aca gaa cat atg aaa 2544
Lys Asn Thr Val Ser Asn Gly Leu Gln Asp Thr Thr Glu His Met Lys
835 840 845



ata cta cgg aga gga gtt aag gat gta tcc gta aca gaa tct gaa gct 2592
Ile Leu Arg Arg Gly Val Lys Asp Val Ser Val Thr Glu Ser Glu Ala
850 855 860

aag gca ttt gat ttg gta gca gaa gta ttt gga aga tat cta gac ttg 2640
Lys Ala Phe Asp Leu Val Ala Glu Val Phe Gly Arg Tyr Leu Asp Leu
865 870 875 880

aaa gaa aga tgt aat aaa ttg gaa tca gat tgc aga gtt aag gag gat 2688
Lys Glu Arg Cys Asn Lys Leu Glu Ser Asp Cys Arg Val Lys Glu Asp
885 890 895

tgc aag gat tta gaa gga gta tgt gga aag ata caa gga gta tgt tcg 2736
Cys Lys Asp Leu Glu Gly Val Cys Gly Lys Ile Gln Gly Val Cys Ser
900 905 910

aaa tta aaa cca ctg aaa gtg aag ccg cac gaa aca gtg aca gaa agc 2784
Lys Leu Lys Pro Leu Lys Val Lys Pro His Glu Thr Val Thr Glu Ser
915 920 925

aca acg acg acc acg acg aca aca acg acc gtt act gat ccg aag gca 2832
Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Val Thr Asp Pro Lys Ala
930 935 940

aca gaa tgc aaa tct tta cag aca aca gat aca tgg att aca cag act 2880
Thr Glu Cys Lys Ser Leu Gln Thr Thr Asp Thr Trp Ile Thr Gln Thr
945 950 955 960

tcg aca cat acc agc acg tct acc atc aca tct aca atc aca tca aaa 2928
Ser Thr His Thr Ser Thr Ser Thr Ile Thr Ser Thr Ile Thr Ser Lys
965 970 975

ata aca ctc aca tca aca agg cgt tgc aaa cca acc aag tgt acg aca 2976
Ile Thr Leu Thr Ser Thr Arg Arg Cys Lys Pro Thr Lys Cys Thr Thr
980 985 990

ggg gat gat gca gag gac gtg aag ccg agt gag gga ttg aag atg agt 3024
Gly Asp Asp Ala Glu Asp Val Lys Pro Ser Glu Gly Leu Lys Met Ser
995 1000 1005

ggg tga aacgtgatga ggggggtgat agtagcaatg gttatttcgt tcatgattta g 3081
Gly
1010



<210> 10 
<211> 1009 
<212> PRT 

<213> Pneumocystis carinii sp. f. hominis 

Met Ala Arg Ala Val Lys Arg Gln Ala Ala Lys Ala Ser Gly Ala Ser
1 5 10 15 

Val Tyr Asp Gly Glu Glu Ile Leu Leu Ala Leu Ile Ala Gly Lys Lys
20 25 30 

Tyr Asn Asp Asn Glu Cys Lys Lys Glu Leu Glu Lys Tyr Cys Lys Thr 
35 40 45 

Leu Thr Asp Ala Glu Leu Lys Pro Glu Lys Val His Lys Lys Leu Lys 
50 55 60 



Glu Phe Cys Glu Asn Lys Lys Ala Asp Ser Lys Cys Lys Glu Leu Lys 
65 70 75 80

Glu Lys Leu Thr Gln Lys Cys Thr Ala Ile Lys Gly Lys Leu Thr Glu
85 90 95

Ala Ile Lys Lys Lys Asn Ser Asp Leu Thr Asp Glu Asp Cys Lys Glu
100 105 110

Asn Glu Gln Gln Cys Leu Phe Leu Glu Gly Ala Cys Pro Ala Glu Leu
115 120 125

Lys Asp Asp Cys Asn Thr Leu Arg Asn Lys Cys Tyr Gln Lys Lys Arg
130 135 140 

Asp Lys Val Ala Glu Glu Ala Leu Leu Arg Ala Val Arg Gly Gly Leu 
145 150 155 160 

Ile Asn Glu Thr Thr Cys Glu Gly Lys Leu Lys Glu Val Cys Ile Glu
165 170 175

Leu Ser Gln Glu Ser Asp Glu Leu Thr Lys Leu Cys Leu Tyr Gln Lys
180 185 190

Met Thr Cys Lys Thr Phe Val Leu Glu Lys Gln Lys Lys Cys Asn Ala
195 200 205

Leu Lys Gln Asp Val Asn Ala Ala Leu Glu Lys Lys Asp Glu Leu Arg
210 215 220

Gly Lys Cys Leu Pro Leu Leu Glu Arg Cys Tyr Phe Tyr Arg Gly Asn 
225 230 235 

Cys Glu Asp Ile Ser Lys Cys Asn Lys Ser Ser Glu Asp Cys Tyr Glu
245 250 

Tyr Leu Pro Val Cys Asp Thr Leu Ala Val Lys Cys Glu Glu Asn Lys 

Ile Ile Tyr Thr His Pro Gly Ser Asp Phe Asn Pro Thr Lys Ser Lys
275 280 285

Pro Thr Val Ala Glu Asp Ile Gly Leu Glu Glu Leu Tyr Lys Lys Ala
290 295 300



Ala Glu Glu Gly Val His Ile Gly Lys Pro Pro Val Arg Asp Ala Thr
305 310 315

Ala Leu Leu Ala Leu Leu Ile Gln Asn Leu Asp Pro Lys Ser Gln Val
325 330 

Gly Lys Glu Cys Glu Lys Val Leu Lys Asp Asn Cys Lys Glu Leu Lys 
340 345 

Ser His Glu Ile Leu Gly Asp Phe Cys Asn Gln Asn Val Ala Gly Gln
355 360 365

Asn Glu Ile Glu Lys Cys Lys Glu Leu Glu Lys Glu Leu Ala Asn Ser

370 375 380 

Thr Lys Ile Leu Phe Glu Lys Ile Lys Asn Lys His Leu Ser Gly Ser
385 390 395

Gly Glu Val Ile Pro Trp Tyr Lys Leu Thr Thr Phe Leu Ser Asp Asn

Asp Cys Thr Arg Leu Glu Ser Asp Cys Phe Tyr Leu Lys Ser Gln Ala
420 425

Pro Leu Asp Lys Glu Cys Asn Asn Leu Lys Ala Ala Cys Tyr Lys Arg
435 440 445

Gly Leu Glu Ala Gln Ala Asn Glu Ala Leu Gln Lys Lys Met Tyr Gly
450 455 460

Leu Phe Tyr Gly Ser Gly Lys Glu Trp Phe Lys Lys Leu Leu Glu Lys
465 470

Ile Met Glu Glu Cys Ser Glu Leu Lys Thr Thr Ser Asp Glu Leu Phe



485 490 



Leu Leu Cys Ile Asp Pro Leu Lys Ala Val Arg Ile Leu Ala Ala Asp
500 505

Ile Gln Ala Arg Ala Val Phe Leu Arg Lys Gln Leu Asp Gln Lys Arg
515 520 525

Asp Phe Pro Thr Asp Lys Asp Cys Lys Glu Leu Gly Arg Lys Cys Glu
530 535

Ala Leu Gly Lys Asp Ser Asn Gln Ile Lys Trp Pro Cys His Thr Leu
545 550



Lys Gln Gln Cys Asp Arg Leu Gly Thr Thr Glu Ile Leu Lys Gln Val
Leu Leu Asp Glu His Lys Asp Thr Leu Arg Thr His Glu Asn Cys Thr 



580 

Lys Tyr Leu Lys Arg Lys Cys His Lys Trp Ser Arg Arg Gly Asp Asp
595 600 605

Arg Phe Ser Phe Val Cys Val Tyr Gln Asn Ala Thr Cys Lys Leu Ile



610 



Val Asp Asp Val Lys Asp Arg Cys Glu Val Phe Glu Lys Asn Met Gln
625 640

Ala Ser Asp Ile Asn Asn Ser Leu Lys Asn Lys Gln Ile Lys Thr Glu

Ser Ala Ala Asn Ile Cys Pro Ser Trp His Pro Tyr Cys Asp Arg Phe
660 665 



Leu Pro Asn Cys Pro Asp Leu Lys Lys Gly Lys Thr Phe Cys Gln Asn

Leu Lys Lys Tyr Cys Glu Pro Phe Tyr Lys Arg Lys Val Leu Glu Asp 

690 695 
Ala Leu Lys Val Glu Leu Gln Gly Asn Leu Ser Asn Arg Asn Lys Cys
705 710 715 

Glu Ser Ala Leu Glu Arg Tyr Cys Thr Ile Leu Lys Asn Val Ser Asp

Ser Ser Ile Asn Ser Leu Cys Lys Asp Asn Thr Glu Ser Lys Thr Lys

740 745 
Lys Thr Asp Asn Glu Val Arg Lys Lys Leu Cys Leu Lys Leu Val Glu 

Glu Val Glu Gln Gln Cys Lys Met Leu Pro Ala Glu Leu Glu His Glu

770 775 
Glu Lys Asp Leu Lys Asp Asp Phe Glu Thr Phe Glu Lys Leu Lys Lys
785 790 795 

Gln Ala Glu Lys Thr Met Asn Lys Ser Asn Leu Val Leu Ser Phe Val
805



Lys Lys Asp Glu Asn Asn Thr Ser Lys Asn Ser Ser Lys Asp Lys Asp 



820 



Lys Asn Thr Val Ser Asn Gly Leu Gln Asp Thr Thr Glu His Met Lys

Ile Leu Arg Arg Gly Val Lys Asp Val Ser Val Thr Glu Ser Glu Ala

850 855 860 

Lys Ala Phe Asp Leu Val Ala Glu Val Phe Gly Arg Tyr Leu Asp Leu 
865 870 875 

Lys Glu Arg Cys Asn Lys Leu Glu Ser Asp Cys Arg Val Lys Glu Asp 



885 



Cys Lys Asp Leu Glu Gly Val Cys Gly Lys Ile Gln Gly Val Cys Ser

Lys Leu Lys Pro Leu Lys Val Lys Pro His Glu Thr Val Thr Glu Ser 
915 920 925 

Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Val Thr Asp Pro Lys Ala 
930 935 940 

Thr Glu Cys Lys Ser Leu Gln Thr Thr Asp Thr Trp Ile Thr Gln Thr
945 950 955 960

Ser Thr His Thr Ser Thr Ser Thr Ile Thr Ser Thr Ile Thr Ser Lys
965 970 975

Ile Thr Leu Thr Ser Thr Arg Arg Cys Lys Pro Thr Lys Cys Thr Thr
980 985 990 

Gly Asp Asp Ala Glu Asp Val Lys Pro Ser Glu Gly Leu Lys Met Ser 
995 1000 1005

Gly 



<210> 11 
<211> 3054 

<212> DNA

<213> Pneumocystis carinii sp. f. hominis

<220> 

<221> CDS 

<222> (1)..(3054)

ac^cg^gcg gtc aag cgg cag gta aca gga gca tca ggg caa tat gat 48
Ala Arg Ala Val Lys Arg Gln Val Thr Gly Ala Ser Gly Gln Tyr Asp
1 5 10 15

gat gaa gtg aat att ttg gcg ttg att cta caa gaa gat gca atg gaa 96
Asp Glu Val Asn Ile Leu Ala Leu Ile Leu Gln Glu Asp Ala Met Glu
20 25 30

cat aca aaa tgc aaa aaa agt tta gaa aaa tac tgc gaa gag ttg aaa 144
Asp Thr Lys Cys Lys Lys Ser Leu Glu Lys Tyr Cys Glu Glu Leu Lys
35 40 45

aaa gca tca cta gac atg gaa aaa gta cat aaa atg ctt aaa gat ttc 192
Lys Ala Ser Leu Asp Met Glu Lys Val His Lys Met Leu Lys Asp Phe
50 55 60

tat gga aat ggg aaa gca agt aaa gca aat aca aaa tgt caa ggt cta 240
Cys Gly Asn Gly Lys Ala Ser Lys Ala Asn Thr Lys Cys Gln Gly Leu
65 70 75 80

PCT/US99/18750



caa gcc aaa gtt acg ggg aaa tgt aca aat ttt aaa aca caa aag cta 288
Gln Ala Lys Val Thr Gly Lys Cys Thr Asn Phe Lys Thr Gln Lys Leu
85 90 95



aaa cca gcg tta aca aat cca tca gat gat aat tgc aaa gag agt gaa 336
Gly Pro Ala Leu Thr Asn Pro Ser Asp Asp Asn Cys Lys Glu Ser Glu
100 105 110

cga caa tgc cta ttt ttg gag gga gca tgc cat aat ctt gta gaa gat 384
Arg Gln Cys Leu Phe Leu Glu Gly Ala Cys His Asn Leu Val Glu Asp
115 120 125

tgt aac aaa cta agg aat cta tgt tac cag aaa aaa cgt gac gga gta 432
Cys Asn Lys Leu Arg Asn Leu Cys Tyr Gln Lys Lys Arg Asp Gly Val
130 135 140

gca gaa gaa gtc ctt ttg agg gca ctt cgt agt gat ctc aat aaa aca 480
Ala Glu Glu Val Leu Leu Arg Ala Leu Arg Ser Asp Leu Asn Lys Thr
145 150 155 160

aaa aca cat gaa aaa aaa ctg aaa gag att tgc cca gtc ttg cag agg 528
Glu Thr His Glu Lys Lys Leu Lys Glu Ile Cys Pro Val Leu Gln Arg
165 170 175

gaa agt aat gaa tta acg gac ttg tgt ttg aac cag aaa aag acg tgc 576
Glu Ser Asn Glu Leu Thr Asp Leu Cys Leu Asn Gln Lys Lys Thr Cys
180 185 190

aaa aat att ata aaa gaa aaa gat aaa aaa tgc act act ctt aaa gca 624
Glu Asn Ile Ile Lys Glu Lys Asp Lys Lys Cys Thr Thr Leu Lys Ala
195 200 205

aat gtt gca aca gca ctt gga agt ttt aaa aaa gaa ata tgc ctt gaa 672
Asn Val Ala Thr Ala Leu Gly Ser Phe Lys Lys Glu Ile Cys Leu Glu
210 215 220

tta ctt gaa caa tgc tat ttt tac att gga aat tgc gga gac gac gat 720
Leu Leu Glu Gln Cys Tyr Phe Tyr Ile Gly Asn Cys Gly Asp Asp Asp
225 230 235 240

ata act aaa tgt att gaa ttg gga ggg aaa tgc caa gaa caa aac att 768
Ile Ile Lys Cys Ile Glu Leu Gly Gly Lys Cys Gln Glu Gln Asn Ile
245 250 255

att tat ata cca cca gga ccc gat ttt gat cca act aga cca gag gct 816
Val Tyr Ile Pro Pro Gly Pro Asp Phe Asp Pro Thr Arg Pro Glu Ala
260 265 270

aca cta gca gag gac ata gac ctg gat gag ctt tat aaa aag gca gaa 864
Thr Leu Ala Glu Asp Ile Asp Leu Asp Glu Leu Tyr Lys Lys Ala Glu
275 280 285

gag gat ggt gtt ttt att gga aaa cat cat tta aga gat gcg aca gct 912
Glu Asp Gly Val Phe Ile Gly Lys His His Leu Arg Asp Ala Thr Ala
290 295 300

tta ttg acg ttg ttg gtt aag aaa gat gat aca gga aaa aat aat aat 960
Leu Leu Thr Leu Leu Val Lys Lys Asp Asp Thr Gly Lys Asn Asn Asn
305 310 315 320

atc aaa gaa aaa tgc aat aag att ctc gaa gat aaa tgc aaa aac tct 1008
Ile Gly Glu Lys Cys Asn Lys Ile Leu Glu Asp Lys Cys Lys Asn Ser
325 330



caa cag cat gaa gct cta aaa aat tta tgt aat aat aat agt cct aat 1056
Gln Gln His Glu Ala Leu Lys Asn Leu Cys Asn Asn Asn Ser Pro Asn
340 345 350

gca tat gga aaa gaa aaa tgc aaa gaa tta gaa gaa gat att aaa aaa 1104
Ala Tyr Gly Lys Glu Lys Cys Lys Glu Leu Glu Glu Asp Ile Lys Lys
355 360 365

aca tac aca aac ctc aaa cca acg att ctt aaa aac cat ctt tat gat 1152
Thr Cys Thr Asn Leu Lys Pro Thr Ile Leu Lys Asn His Leu Tyr Asp
370 375 380

»*- rr*i- att att gag tag aga aaa ctg cca aca ttt ctt act 1200
Pro Asn Asp Lys Ile Val Glu Trp Arg Lys Leu Pro Thr Phe Leu Thr
385 390

aat aaa aac tgt gca aga ttg gaa tct tat tgt ttt tac tac gaa aaa 1248
Asn Glu Asp Cys Ala Arg Leu Glu Ser Tyr Cys Phe Tyr Tyr Glu Lys
405 410

act tgt cca aat gcc aaa gaa gag tgt atg aat ttg agg gca gcg tgt 1296
Ala Cys Pro Asn Ala Lys Glu Glu Cys Met Asn Leu Arg Ala Ala Cys
420 425

, at . aaa , aa aaa ctt gat gga cgg gca aat aaa gtg ctg caa gaa aat 1344
Tyr Lys Arg Gly Leu Asp Gly Arg Ala Asn Lys Val Leu Gln Glu Asn
435

r~„r naa tta tta cat ggt tca aat caa agt tgg ctt aag gag ttt 1392
Met Arg Gly Leu Leu Arg Gly Ser Asn Gln Ser Trp Leu Lys Glu Phe
450 455 460

caa caa aaa tta at a aaa gta tgt aag gag cta aaa gaa aat aaa gga 1440
Gln Gln Arg Leu Val Lys Val Cys Lys Glu Leu Lys Glu Asn Lys Gly
465 470

aat ttc cca aac gat gaa ata ttt gtt ctg tgt gta cag cca gca aaa 1488
Ser Phe Pro Asn Asp Glu Ile Phe Val Leu Cys Val Gln Pro Ala Lys



act aca cga tta ctt aca cac gat cat caa atg agg gtt atc ttt tta 1536
Ala Ala Arg Leu Leu Thr His Asp His Gln Met Arg Val Ile Phe Leu
500 505 510

cga caa caa ttg gat caa aag aga gat ttt ccg aca gat aaa gac tgc 1584
Arg Gln Gln Leu Asp Gln Lys Arg Asp Phe Pro Thr Asp Lys Asp Cys
515 520 525

aaa aaa tta ggg aaa aaa tgc caa gat tta gga aag gat tca aaa gaa 1632
Lys Glu Leu Gly Lys Lys Cys Gln Asp Leu Gly Lys Asp Ser Lys Glu
530 535 540

att aca tgg cca tgt cat acg ctg gag cag caa tgc aat cgc ttg ggg 1680
Ile Thr Trp Pro Cys His Thr Leu Glu Gln Gln Cys Asn Arg Leu Gly

act aca gaa att tta aag cag get tta teg gat gaa cac aaa gat act 1728
Thr Thr Glu Ile Leu Lys Gln Val Leu Leu Asp Glu His Lys Asp Thr
565 570 575

ttg aaa gac caa gaa agt tgt gta aaa tac cta aaa gaa aag tgt aat 1776
Leu Lys Asp Gln Glu Ser Cys Val Lys Tyr Leu Lys Glu Lys Cys Asn
580 585 590

aaa tgg tct aga aga gga gat gac cgt ttc tct ttt gta tgt gtc ttc 1824
Lys Trp Ser Arg Arg Gly Asp Asp Arg Phe Ser Phe Val Cys Val Phe
595 600 605

caa aac gct acg tgt gag ctg atg gta aaa gac gtg aaa gac agg tgt 1872
Gln Asn Ala Thr Cys Glu Leu Met Val Lys Asp Val Lys Asp Arg Cys
610 615 620

aaa ata ttc aaa aaa aat ata aaa gct tca tat att att gaa ttt ctt 1920
Glu Val Phe Lys Lys Asn Ile Lys Ala Ser Tyr Ile Ile Glu Phe Leu
625 630 635 640

gaa aat aat aca aat aaa ata aca aca ctg gaa aga aat tgt ccc tct 1968
Glu Asn Asn Thr Asn Lys Ile Thr Thr Leu Glu Arg Asn Cys Pro Ser
645 650 655

tgg cat acg tat tgc aat aga ttt tca cct aat tgt cca ggt ctt acg 2016
Trp His Thr Tyr Cys Asn Arg Phe Ser Pro Asn Cys Pro Gly Leu Thr
660 665 670

aaa gag aat agt tgt aca aaa atc aag aag cat tgt gag ccg ttc tat 2064
Lys Glu Asn Ser Cys Thr Lys Ile Lys Lys His Cys Glu Pro Phe Tyr
675 680 685

aaa aaa aag gcc ttg gaa gat gct ctc aaa gta gag ctt caa gga aaa 2112
Lys Arg Lys Ala Leu Glu Asp Ala Leu Lys Val Glu Leu Gln Gly Lys
690 695 700

tta act gat aaa tct aaa tgt gaa cct gca ttg aaa aga tat tgt aca 2160
Leu Thr Asp Lys Ser Lys Cys Glu Pro Ala Leu Lys Arg Tyr Cys Thr
705 710 715 720

gta gcg gga aac gta aat aat gcg tca atc agt ggc tta tgc aaa gct 2208
Val Ala Gly Asn Val Asn Asn Ala Ser Ile Ser Gly Leu Cys Lys Ala
725 730 735

aac acc aag gat aac tct gga aag agt gat gag gat gct aga aag gaa 2256
Asn Thr Lys Asp Asn Ser Gly Lys Ser Asp Glu Asp Ala Arg Lys Glu
740 745 750

ctc tgt gag aaa tta gtg aaa gaa gtg gaa gaa cag tgc aaa gca tta 2304
Leu Cys Glu Lys Leu Val Lys Glu Val Glu Glu Gln Cys Lys Ala Leu
755 760 765

cca aca gaa tta gga caa ccg gca gct gat tta aaa aaa gat tat aag 2352
Pro Thr Glu Leu Gly Gln Pro Ala Ala Asp Leu Lys Lys Asp Tyr Lys
770 775 780

aca tat gag gaa ctt aag aaa cgt gca gag gaa gca atg aac aag tcc 2400
Thr Tyr Glu Glu Leu Lys Lys Arg Ala Glu Glu Ala Met Asn Lys Ser
785 790 795 800

aat ctr gtr ttg tca ctc att aag aaa aac gaa agc aac gta tca aaa 2448
Ser Leu Val Leu Ser Leu Ile Lys Lys Asn Glu Ser Asn Val Ser Lys
805 810 815

agt aac agc aaa aac aag gat aag aat gcc gtt tca aac gga ctt caa 2496
Ser Asn Ser Lys Asn Lys Asp Lys Asn Ala Val Ser Asn Gly Leu Gln
820 825 830

gat acc aca aaa cat gtg aaa ata cta cgg aga gga gtt aag gat gta 2544
Asp Thr Thr Lys His Val Lys Ile Leu Arg Arg Gly Val Lys Asp Val
835 840 845

tcc gta aca gaa tta gaa gct aaa gca ttt gat ttg gca gca gaa gta 2592
Ser Val Thr Glu Leu Glu Ala Lys Ala Phe Asp Leu Ala Ala Glu Val
850 855 860

ttt gga aga tat gta gat ttg aag gaa aga tgt aat aaa ttg gaa tca 2640
Phe Gly Arg Tyr Val Asp Leu Lys Glu Arg Cys Asn Lys Leu Glu Ser
865 870 875 880

gat tgc aga att aag gag gat tgc aaa gac tta gaa gaa gta tgc aaa 2688
Asp Cys Arg Ile Lys Glu Asp Cys Lys Asp Leu Glu Glu Val Cys Lys
885 890 895

aag att aat aag gct tgt cgc aat ctg aag cct ctg gag gtg aag ccg 2736
Lys Ile Asn Lys Ala Cys Arg Asn Leu Lys Pro Leu Glu Val Lys Pro
900 905 910

cac gaa aca gtg aca gaa agt aca acg aca act aca aca aca aca aca 2784
His Glu Thr Val Thr Glu Ser Thr Thr Thr Thr Thr Thr Thr Thr Thr
915 920 925

acc gtt gcc gat ccg aag gca acg gaa tgc aaa tcc tta cag aca aca 2832
Thr Val Ala Asp Pro Lys Ala Thr Glu Cys Lys Ser Leu Gln Thr Thr
930 935 940

gac aca tgg gtt aca cag aca tcg aca cac aca agc acg tct act atc 2880
Asp Thr Trp Val Thr Gln Thr Ser Thr His Thr Ser Thr Ser Thr Ile
945 950 955 960 

aca tct acc atc aca tca aaa ata aca ttg aca tca acg agg cga tgc 2928
Thr Ser Thr Ile Thr Ser Lys Ile Thr Leu Thr Ser Thr Arg Arg Cys
965 970 975 

aaa cca acc aag tgt acg aca ggg gat gat gca gaa gac gtg aag cca 2976
Lys Pro Thr Lys Cys Thr Thr Gly Asp Asp Ala Glu Asp Val Lys Pro 
980 985 990 

agt gaa ggc ttg agg gtg agc ggg tgg aat gtg atg agg ggg gtg ata 3024
Ser Glu Gly Leu Arg Val Ser Gly Trp Asn Val Met Arg Gly Val Ile
995 1000 1005 

gta gca atg gtt att tcg ttc atg att tag 3054
Val Ala Met Val Ile Ser Phe Met Ile
1010 1015

<211> 1017
<213> Pneumocystis carinii sp. f. hominis

Ala Arg Ala Val Lys Arg Gln Val Thr Gly Ala Ser Gly Gln Tyr Asp
1 5 10

Asp Glu Val Asn Ile Leu Ala Leu Ile Leu Gln Glu Asp Ala Met Glu
20 25

Asp Thr Lys Cys Lys Lys Ser Leu Glu Lys Tyr Cys Glu Glu Leu Lys
35 40

Lys Ala Ser Leu Asp Met Glu Lys Val His Lys Met Leu Lys Asp Phe
50 55 60

Cys Gly Asn Gly Lys Ala Ser Lys Ala Asn Thr Lys Cys Gin Gly Leu 
65 70 75 

Gln Ala Lys Val Thr Gly Lys Cys Thr Asn Phe Lys Thr Gln Lys Leu
85 90 95

Gly Pro Ala Leu Thr Asn Pro Ser Asp Asp Asn Cys Lys Glu Ser Glu 
100 105 110 

Arg Gln Cys Leu Phe Leu Glu Gly Ala Cys His Asn Leu Val Glu Asp
115 120 125

Cys Asn Lys Leu Arg Asn Leu Cys Tyr Gln Lys Lys Arg Asp Gly Val
130 135 140

Ala Glu Glu Val Leu Leu Arg Ala Leu Arg Ser Asp Leu Asn Lys Thr 
145 150

Glu Thr His Glu Lys Lys Leu Lys Glu Ile Cys Pro Val Leu Gln Arg
165 170

Glu Ser Asn Glu Leu Thr Asp Leu Cys Leu Asn Gln Lys Lys Thr Cys

180 185 

Glu Asn Ile Ile Lys Glu Lys Asp Lys Lys Cys Thr Thr Leu Lys Ala
195 200 205

Asn Val Ala Thr Ala Leu Gly Ser Phe Lys Lys Glu Ile Cys Leu Glu
210 215 220

Leu Leu Glu Gln Cys Tyr Phe Tyr Ile Gly Asn Cys Gly Asp Asp Asp
225 230 235

Ile Ile Lys Cys Ile Glu Leu Gly Gly Lys Cys Gln Glu Gln Asn Ile
245 250 255

Val Tyr Ile Pro Pro Gly Pro Asp Phe Asp Pro Thr Arg Pro Glu Ala
260 265

Thr Leu Ala Glu Asp Ile Asp Leu Asp Glu Leu Tyr Lys Lys Ala Glu
275 280 285

Glu Asp Gly Val Phe Ile Gly Lys His His Leu Arg Asp Ala Thr Ala
290 295 300 

Leu Leu Thr Leu Leu Val Lys Lys Asp Asp Thr Gly Lys Asn Asn Asn 
305 310 315 





WO 00/09760 

Ile Gly Glu Lys Cys Asn Lys Ile Leu Glu Asp Lys Cys Lys Asn Ser
325 330 335 

Gln Gln His Glu Ala Leu Lys Asn Leu Cys Asn Asn Asn Ser Pro Asn
340 345 350

Ala Tyr Gly Lys Glu Lys Cys Lys Glu Leu Glu Glu Asp Ile Lys Lys
355 360 365

Thr Cys Thr Asn Leu Lys Pro Thr Ile Leu Lys Asn His Leu Tyr Asp
370 375 380

Pro Asn Asp Lys Ile Val Glu Trp Arg Lys Leu Pro Thr Phe Leu Thr
385 390 395 400 

Asn Glu Asp Cys Ala Arg Leu Glu Ser Tyr Cys Phe Tyr Tyr Glu Lys 
405 410 415 

Ala Cys Pro Asn Ala Lys Glu Glu Cys Met Asn Leu Arg Ala Ala Cys 
420 425 430

Tyr Lys Arg Gly Leu Asp Gly Arg Ala Asn Lys Val Leu Gln Glu Asn
435 440 445

Met Arg Gly Leu Leu Arg Gly Ser Asn Gln Ser Trp Leu Lys Glu Phe
450 455 460

Gln Gln Arg Leu Val Lys Val Cys Lys Glu Leu Lys Glu Asn Lys Gly
465 470 475 480

Ser Phe Pro Asn Asp Glu Ile Phe Val Leu Cys Val Gln Pro Ala Lys
485 490 495

Ala Ala Arg Leu Leu Thr His Asp His Gln Met Arg Val Ile Phe Leu
500 505 510

Arg Gln Gln Leu Asp Gln Lys Arg Asp Phe Pro Thr Asp Lys Asp Cys
515 520 525

Lys Glu Leu Gly Lys Lys Cys Gln Asp Leu Gly Lys Asp Ser Lys Glu
530 535 540

Ile Thr Trp Pro Cys His Thr Leu Glu Gln Gln Cys Asn Arg Leu Gly
545 550 555 560

Thr Thr Glu Ile Leu Lys Gln Val Leu Leu Asp Glu His Lys Asp Thr
565 570 575

Leu Lys Asp Gln Glu Ser Cys Val Lys Tyr Leu Lys Glu Lys Cys Asn
580 585 590

Lys Trp Ser Arg Arg Gly Asp Asp Arg Phe Ser Phe Val Cys Val Phe
595 600 605

Gln Asn Ala Thr Cys Glu Leu Met Val Lys Asp Val Lys Asp Arg Cys
610 615 620 



Glu Val Phe Lys Lys Asn Ile Lys Ala Ser Tyr Ile Ile Glu Phe Leu
625 630 635 640

Glu Asn Asn Thr Asn Lys Ile Thr Thr Leu Glu Arg Asn Cys Pro Ser
645 650 655 

Trp His Thr Tyr Cys Asn Arg Phe Ser Pro Asn Cys Pro Gly Leu Thr 



660 



Lys Glu Asn Ser Cys Thr Lys Ile Lys Lys His Cys Glu Pro Phe Tyr
675 680 685

Lys Arg Lys Ala Leu Glu Asp Ala Leu Lys Val Glu Leu Gln Gly Lys
690 700 

Leu Thr Asp Lys Ser Lys Cys Glu Pro Ala Leu Lys Arg Tyr Cys Thr 
705 710 715 

Val Ala Gly Asn Val Asn Asn Ala Ser Ile Ser Gly Leu Cys Lys Ala
725 730 735 

Asn Thr Lys Asp Asn Ser Gly Lys Ser Asp Glu Asp Ala Arg Lys Glu 
740 745 

Leu Cys Glu Lys Leu Val Lys Glu Val Glu Glu Gln Cys Lys Ala Leu
755 760 765

Pro Thr Glu Leu Gly Gln Pro Ala Ala Asp Leu Lys Lys Asp Tyr Lys
770 775 780 

Thr Tyr Glu Glu Leu Lys Lys Arg Ala Glu Glu Ala Met Asn Lys Ser 
785 790 795 800 

Ser Leu Val Leu Ser Leu Ile Lys Lys Asn Glu Ser Asn Val Ser Lys
805 810 815

Ser Asn Ser Lys Asn Lys Asp Lys Asn Ala Val Ser Asn Gly Leu Gln
820 825 830

Asp Thr Thr Lys His Val Lys Ile Leu Arg Arg Gly Val Lys Asp Val
835 840 845 

Ser Val Thr Glu Leu Glu Ala Lys Ala Phe Asp Leu Ala Ala Glu Val
850 855 860



Phe Gly Arg Tyr Val Asp Leu Lys Glu Arg Cys Asn Lys Leu Glu Ser 
865 870 875 

Asp Cys Arg Ile Lys Glu Asp Cys Lys Asp Leu Glu Glu Val Cys Lys
885 890 895

Lys Ile Asn Lys Ala Cys Arg Asn Leu Lys Pro Leu Glu Val Lys Pro
900 905 910 

His Glu Thr Val Thr Glu Ser Thr Thr Thr Thr Thr Thr Thr Thr Thr 
915 920 925 

Thr Val Ala Asp Pro Lys Ala Thr Glu Cys Lys Ser Leu Gln Thr Thr
930 935 940 

Asp Thr Trp Val Thr Gln Thr Ser Thr His Thr Ser Thr Ser Thr Ile

Thr Ser Thr Ile Thr Ser Lys Ile Thr Leu Thr Ser Thr Arg Arg Cys
965 970 975



Lys Pro Thr Lys Cys Thr Thr Gly Asp Asp Ala Glu Asp Val Lys Pro 
980 985 990 

Ser Glu Gly Leu Arg Val Ser Gly Trp Asn Val Met Arg Gly Val Ile
995 1000 1005

Val Ala Met Val Ile Ser Phe Met Ile
1010 1015



<210> 13 
<211> 3072 

<212> DNA

<213> Pneumocystis carinii sp. f. hominis

<220> 

<221> CDS 

<222> (1)..(3072)

^°g C g 3 cgg gcg gtc aag cgg cag gca gca ggg aca cag aat agc att 48
Met Ala Arg Ala Val Lys Arg Gln Ala Ala Gly Thr Gln Asn Ser Ile
1 5 10

aat aag gaa cat gtt tta gct tta att cta aag gaa gat gga cta agt 96
Zt 111 Su His Val Leu Ala Leu Ile Leu Lys Glu Asp Gly Leu Ser
20 25 30

aaa cag gaa tgc aaa aaa aaa cta aaa aaa tat tgc caa gaa ttg act 144
111 Gln Glu Cys Lys Lys Lys Leu Lys Lys Tyr Cys Gln Glu Leu Thr
35 40 45

aaa gca aaa cta aat ata gaa caa gta cac aga aaa ctt aaa ggt ttt
Glu Ala Lys Leu Asn Ile Glu Gln Val His Arg Lys Leu Lys Gly Phe
50 55 60

tac aaa gat gga aaa gca gat aca aaa tgc aaa gaa ctg aaa gcc aat 240
ell S" 1% Gly Lys Ala Asp Thr Lys Cys Lys Glu Leu Lys Ala Asn
65 70

att gag aaa aaa tgt act aca atc aaa gga aaa ctt aaa gaa gca att 288
Til Gil Lys Lys Cys Thr Thr Ile Lys Gly Lys Leu Lys Glu Ala Ile
85

aaa aaa aaa att cag att ata acg gat aag gat tgc aaa gag aat gaa 336
Lys Lys Lys Ile Gln Ile Ile Thr Asp Lys Asp Cys Lys Glu Asn Glu
100 105 110

caa caa tgc cta ttt ttg gag gga gta tgt tca aaa gaa ctt aaa gat 384
Gln Gln Cys Leu Phe Leu Glu Gly Val Cys Ser Lys Glu Leu Lys Asp
115 120 125

gat tgc aat act ttg aga aat aag tgc tat caa aag aaa cgt gat aaa 432
Asp Cys Asn Thr Leu Arg Asn Lys Cys Tyr Gln Lys Lys Arg Asp Lys
130 135 140

gtt gcg gaa gaa gtt ctt tta aga gca ctt cgt agc gat ctt aat go. 480
Val Ala Glu Glu Val Leu Leu Arg Ala Leu Arg Ser Asp Leu Asn Gly
145 150 155 160
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s e e E ss £ - = E « - - « ? " i " ** 

165 17U 



*. „~ =ra aac tta tat ctq aac cag aaa gag aca 

E E E E ffi E E E E £ E «n OX- Ly. C1U - 
180 185 
=> a t- atf tta att qaa aaa gat aag aag tgc ggt act ctt aaa 
I?, Lys Asn fie Leu lie Slu Lys Lp Lys Lys Cys Gly Thr Leu Lys 
195 200 
*. <-+- ara aca eta qqa agt ttt aaa aaa gaa aca tgt ctt 

E E SS E S E E 2, sir ». ly Lys Glu T„r Cys Leu 



^ a rte aaa caa tgc tat ttt tac att gga aat tgc gga gac gac 
SI Leu Tel ITu Tin c'ys Tyr Phe Tyr He Gly Asn Cys Gly Asp Asp 



720 



225 



230 



e s: ;s e | e e e e e - e - ee e 166 

816 



s e E k s e a; s s e E s e s e a 

260 265 

E E E E S E E E E E E E E E E E 
E E S E E E E E E E E E E E E E 



995 3°° 

290 



. «.«.„ tta aca ttq ttg ate caa gat tct agt ctt aaa aaa aaa gac 960 
& HI HI IT, HI L.« He Gin Asp Ser Ser Leu Lys Lys Lys Asp 



305 



310 

1008 



5 E E S E E S E E E S E E S E S 

E E E E E E E E E E E E E E E E 

340 345 

E E E S E E E E E E E E E E E E 

355 360 J& 

„ = = al -f i-1-c act tea aaa gtc act aat aat cgt ctt ttt gat 

S Cys Lys lie Phe S Ser Lys val Thr Asn A.„ Arg Leu Phe Asp 
370 375 

S E E E E E E S E E S E E E S E 



385 390 395

rrr ctt agc aac aaa gat tgt gcg aaa ttg gag tcc tat tgt ttc tat 1248
Phe Leu Ser Asn Glu Asp Cys Ala Lys Leu Glu Ser Tyr Cys Phe Tyr
405 410

ttt gaa aaa aaa tgt cca gat gga gaa aat gca tgt aaa aat ata aga 1296
Phe Glu Lys Lys Cys Pro Asp Gly Glu Asn Ala Cys Lys Asn Ile Arg
420 425 430

aca aca tgt tac aaa aga gga ctt gat gca cgg gca aat aaa gtg ctg 1344
Ala Thr Cys Tyr Lys Arg Gly Leu Asp Ala Arg Ala Asn Lys Val Leu
435 440 445

r. a * aaa aat atg cga gga atg tta cat ggt tca aac aaa agc tgg ctt 1392
Gln Glu Asn Met Arg Gly Met Leu His Gly Ser Asn Lys Ser Trp Leu
450 455 460

aaa aaa ttt caa caa gaa tta gta aaa gta tgt gag aaa ctg aaa aaa 1440
Glu Lys Phe Gln Gln Glu Leu Val Lys Val Cys Glu Lys Leu Lys Lys
465 470

aaa aac aaa gga agt ttc tca aac gat gaa tta ttt att ctg tgt gta 1488
g" IS Lys Gly Ser Phe Ser Asn Asp Glu Leu Phe Ile Leu Cys Val
485 490

cag cca gca aaa gca gcc cgg ttg ctt aca cat gat ctt cga atg aaa 1536
Gln Pro Ala Lys Ala Ala Arg Leu Leu Thr His Asp Leu Arg Met Lys
500 505

act atc ttt tta cga caa caa ctg gat caa aag cga gat ttc ccg aca 1584
IS 111 Ht Leu Arg Gln Gln Leu Asp Gln Lys Arg Asp Phe Pro Thr
515 520 525

gat aaa aat tgc aag gaa ttg ggg aga aag tgc caa gat tta gga gag 1632
Asp Lys Asn Cys Lys Glu Leu Gly Arg Lys Cys Gln Asp Leu Gly Glu
530 535

aaa aaa att aca tgg cca tgt cat aca ctg gag cag caa tgc 1680
Asp s" Lys 111 Ile Thr Trp Pro Cys His Thr Leu Glu Gln Gln Cys
545 550 555 560

aat cgc ttg ggg act aca gaa att tta aag cag gtt tta ttg gat gaa 1728
Asn Arg Leu Gly Thr Thr Glu Ile Leu Lys Gln Val Leu Leu Asp Glu
565 570

cac aaa gat act ttg aaa gac caa gaa agt tgt gta aaa tac cta aaa 1776
His Lys Asp Thr Leu Lys Asp Gln Glu Ser Cys Val Lys Tyr Leu Lys

gaa aag tgt aat aaa tgg tct aga aga gga gat gac cgt ttc tct ttt 1824
Glu Lys Cys Asn Lys Trp Ser Arg Arg Gly Asp Asp Arg Phe Ser Phe
595 600 605 

gta tgt gtc ttc caa aac gct acg tgt gag ctg atg gta aaa gac gtg 1872
Val Cys Val Phe Gln Asn Ala Thr Cys Glu Leu Met Val Lys Asp Val
610 615 620 

aaa aac agg tgt gaa gta ttc aaa aaa aat ata aaa gct tca tat att 1920
Lys Asp Arg Cys Glu Val Phe Lys Lys Asn Ile Lys Ala Ser Tyr Ile
625 630 635 

att gaa ttt ctt gaa aat aat aca aat aaa ata aca aca ctg gaa aga 1968
Ile Glu Phe Leu Glu Asn Asn Thr Asn Lys Ile Thr Thr Leu Glu Arg
645 650 655

aat tgt ccc tct tgg cat acg tat tgc aat aga ttt tca cct aat tgt 2016
Asn Cys Pro Ser Trp His Thr Tyr Cys Asn Arg Phe Ser Pro Asn Cys
660 665 670

cca aat ctt acg aaa gag aat agt tgt aca aaa atc aag aag cat cgt 2064
Pro Gly Leu Thr Lys Glu Asn Ser Cys Thr Lys Ile Lys Lys His Arg
675 680 685

gag ccg ttc tat aaa aga aag gcc ttg gaa gat gct ctc aaa gta gag 2112
Glu Pro Phe Tyr Lys Arg Lys Ala Leu Glu Asp Ala Leu Lys Val Glu 
690 695 700 

ctt caa gga aaa ttg act gat aaa tct aaa tgt gaa cct gca ttg aaa 2160 
Leu Gln Gly Lys Leu Thr Asp Lys Ser Lys Cys Glu Pro Ala Leu Lys
705 710 715 720 

aga tat tgt aca gta gcg gga aac gta aat aat gcg tca atc agt ggc 2208
Arg Tyr Cys Thr Val Ala Gly Asn Val Asn Asn Ala Ser Ile Ser Gly
725 730 735 

tta tgc aaa gct aac acc aag gat aac tct gga aag agt gat gag gat 2256
Leu Cys Lys Ala Asn Thr Lys Asp Asn Ser Gly Lys Ser Asp Glu Asp
740 745 750

act aga aag gaa ctc tgt gag aaa tta gtg aaa gaa gtg gaa gaa cag 2304
Ala Arg Lys Glu Leu Cys Glu Lys Leu Val Lys Glu Val Glu Glu Gln
755 760 765

tgc aaa gca tta cca aca gaa tta gga caa ccg gca gct gat cta aaa 2352
Cys Lys Ala Leu Pro Thr Glu Leu Gly Gln Pro Ala Ala Asp Leu Lys
770 775 780

aaa aat tat aag aca tat gag gaa ctt aag aaa cgt gca gag gaa gca 2400
Lys III Tyr Lys Thr Tyr Glu Glu Leu Lys Lys Arg Ala Glu Glu Ala
785 790 795 800

ata aac aag tcc agt ctt gtt ttg tca ctc att aag aaa aac gaa agt 2448
Met Asn Lys Ser Ser Leu Val Leu Ser Leu Ile Lys Lys Asn Glu Ser
805 810 815

aat gta tca aaa agt aat agc aaa aac aag gat aag aat gcc gtt tca 2496
Asn Val Ser Lys Ser Asn Ser Lys Asn Lys Asp Lys Asn Ala Val Ser
820 825 830

aac aga ctt caa gat acc aca aaa cat gtg aaa ata cta cgg agg gga 2544
Asn Gly Leu Gln Asp Thr Thr Lys His Val Lys Ile Leu Arg Arg Gly
835 840 845

att aag gat gta tcc gta aca gaa tta gaa gct aaa gca ttt gat ttg 2592
Val Lys Asp Val Ser Val Thr Glu Leu Glu Ala Lys Ala Phe Asp Leu
850 855 860

aca gca gaa gta ttt gga aga tat gta gat ttg aag gaa aga tgt aat 2640
Ala Ala Glu Val Phe Gly Arg Tyr Val Asp Leu Lys Glu Arg Cys Asn
865 870 875 880

aaa ttg gaa tca gat tgc aga att aag gag gat tgc aaa gac tta gaa
Lys Leu Glu Ser Asp Cys Arg Ile Lys Glu Asp Cys Lys Asp Leu Glu
885 890 895



gaa gta tgc aaa aag att aat aag gct tgt cgc aat ctg aag cct ctg 2736
Glu Val Cys Lys Lys Ile Asn Lys Ala Cys Arg Asn Leu Lys Pro Leu
900 905 910 

gag gtg aag ccg cac gaa aca gtg aca gaa agt aca acg aca act aca 2784 
Glu Val Lys Pro His Glu Thr Val Thr Glu Ser Thr Thr Thr Thr Thr 
915 920 925 

aca aca aca aca acc gtt gcc gat ccg aag gca acg gaa tgc aaa tcc 2832
Thr Thr Thr Thr Thr Val Ala Asp Pro Lys Ala Thr Glu Cys Lys Ser 
930 935 940 

tta cag aca aca gac aca tgg gtt aca cag aca tcg aca cac aca agc 2880
Leu Gln Thr Thr Asp Thr Trp Val Thr Gln Thr Ser Thr His Thr Ser
945 950 955 960 

acg tct act atc aca tct acc atc aca tca aaa ata aca ttg aca tca 2928
Thr Ser Thr Ile Thr Ser Thr Ile Thr Ser Lys Ile Thr Leu Thr Ser
965 970 975 

acg agg cga tgc aaa cca acc aag tgt acg aca gga gag gaa gat gat 2976 
Thr Arg Arg Cys Lys Pro Thr Lys Cys Thr Thr Gly Glu Glu Asp Asp 
980 985 990 



gca gga gac gtg aaa ccg agt gag ggg ctg agg atg agt ggg tgg aat 3024
Ala Gly Asp Val Lys Pro Ser Glu Gly Leu Arg Met Ser Gly Trp Asn
995 1000 1005

gtg atg agg ggg gtg ata gta gca atg gtt att tcg ttc atg att tag 3072
Val Met Arg Gly Val Ile Val Ala Met Val Ile Ser Phe Met Ile
1010 1015 1020



<210> 14 
<211> 1023 
<212> PRT 

<213> Pneumocystis carinii sp. f. hominis 
<400> 14 

Met Ala Arg Ala Val Lys Arg Gln Ala Ala Gly Thr Gln Asn Ser Ile
1 5 10 15

Asp Glu Glu His Val Leu Ala Leu Ile Leu Lys Glu Asp Gly Leu Ser
20 25 30

Glu Gln Glu Cys Lys Lys Lys Leu Lys Lys Tyr Cys Gln Glu Leu Thr
35 40 45

Glu Ala Lys Leu Asn Ile Glu Gln Val His Arg Lys Leu Lys Gly Phe
50 55 60

Cys Glu Asp Gly Lys Ala Asp Thr Lys Cys Lys Glu Leu Lys Ala Asn
65 70 75 80

Ile Glu Lys Lys Cys Thr Thr Ile Lys Gly Lys Leu Lys Glu Ala Ile
85 90 95

Lys Lys Lys Ile Gln Ile Ile Thr Asp Lys Asp Cys Lys Glu Asn Glu
100 105 110

Gln Gln Cys Leu Phe Leu Glu Gly Val Cys Ser Lys Glu Leu Lys Asp
115 120 125

Asp Cys Asn Thr Leu Arg Asn Lys Cys Tyr Gln Lys Lys Arg Asp Lys
130 135 140

Val Ala Glu Glu Val Leu Leu Arg Ala Leu Arg Ser Asp Leu Asn Gly
145 150 155 160

Ser Val Ile Cys Glu Lys Lys Leu Lys Glu Ile Cys Pro Val Met Gly
165 170 175

Arg Glu Ser Asp Glu Leu Thr Asn Leu Cys Leu Asn Gln Lys Glu Thr
180 185 190

Cys Lys Asn Ile Leu Ile Glu Lys Asp Lys Lys Cys Gly Thr Leu Lys
195 200 205

Thr Asp Val Ser Ala Ala Leu Gly Ser Phe Lys Lys Glu Thr Cys Leu
210 215 220

Glu Leu Leu Glu Gln Cys Tyr Phe Tyr Ile Gly Asn Cys Gly Asp Asp
225 230 235 240

Asp Ile Ile Lys Cys Ile Glu Leu Gly Gly Lys Cys Gln Glu Gln Asn
245 250 255

Ile Ala Tyr Met Pro Pro Gly Pro Asp Phe Asp Pro Thr Arg Pro Glu
260 265 270

Ala Thr Ile Ala Glu Asp Ile Gly Leu Glu Glu Phe Tyr Lys Lys Val
275 280 285

Glu Glu Asp Gly Val Phe Ile Gly Lys Asn His Leu Arg Asp Ala Thr
290 295 300

Ala Leu Leu Ala Leu Leu Ile Gln Asp Ser Ser Leu Lys Lys Lys Asp
305 310 315 320

Asp Lys Glu Lys Cys Glu Glu Ala Leu Gln Lys Ser Cys Lys Asn Pro
325 330 335

His Glu His Glu Ala Leu Glu Ser Leu Cys Lys Lys Asn Gly Leu Ser
340 345 350

Asn Asp Gly Thr Lys Lys Cys Glu Glu Leu Gln Asn Asp Ile Asn Lys
355 360 365

Thr Cys Lys Ile Phe Thr Ser Lys Val Thr Asn Asn Arg Leu Phe Asp
370 375 380

Pro Thr Lys Gly Asn Asn Glu Ile Val Gly Trp Glu Gly Leu Pro Thr
385 390 395 400

Phe Leu Ser Asn Glu Asp Cys Ala Lys Leu Glu Ser Tyr Cys Phe Tyr
405 410 415

Phe Glu Lys Lys Cys Pro Asp Gly Glu Asn Ala Cys Lys Asn Ile Arg
420 425 430

Ala Thr Cys Tyr Lys Arg Gly Leu Asp Ala Arg Ala Asn Lys Val Leu
435 440 445

Gln Glu Asn Met Arg Gly Met Leu His Gly Ser Asn Lys Ser Trp Leu
450 455 460

Glu Lys Phe Gln Gln Glu Leu Val Lys Val Cys Glu Lys Leu Lys Lys
465 470 475 480

Glu Asn Lys Gly Ser Phe Ser Asn Asp Glu Leu Phe Ile Leu Cys Val
485 490 495

Gln Pro Ala Lys Ala Ala Arg Leu Leu Thr His Asp Leu Arg Met Lys
500 505 510

Thr Ile Phe Leu Arg Gln Gln Leu Asp Gln Lys Arg Asp Phe Pro Thr
515 520 525

Asp Lys Asn Cys Lys Glu Leu Gly Arg Lys Cys Gln Asp Leu Gly Glu
530 535 540

Asp Ser Lys Glu Ile Thr Trp Pro Cys His Thr Leu Glu Gln Gln Cys
545 550 555 560

Asn Arg Leu Gly Thr Thr Glu Ile Leu Lys Gln Val Leu Leu Asp Glu
565 570 575

His Lys Asp Thr Leu Lys Asp Gln Glu Ser Cys Val Lys Tyr Leu Lys
580 585 590

Glu Lys Cys Asn Lys Trp Ser Arg Arg Gly Asp Asp Arg Phe Ser Phe
595 600 605

Val Cys Val Phe Gln Asn Ala Thr Cys Glu Leu Met Val Lys Asp Val
610 615 620

Lys Asp Arg Cys Glu Val Phe Lys Lys Asn Ile Lys Ala Ser Tyr Ile
625 630 635 640

Ile Glu Phe Leu Glu Asn Asn Thr Asn Lys Ile Thr Thr Leu Glu Arg
645 650 655

Asn Cys Pro Ser Trp His Thr Tyr Cys Asn Arg Phe Ser Pro Asn Cys
660 665 670

Pro Gly Leu Thr Lys Glu Asn Ser Cys Thr Lys Ile Lys Lys His Arg
675 680 685

Glu Pro Phe Tyr Lys Arg Lys Ala Leu Glu Asp Ala Leu Lys Val Glu
690 695 700

Leu Gln Gly Lys Leu Thr Asp Lys Ser Lys Cys Glu Pro Ala Leu Lys
705 710 715 720

Arg Tyr Cys Thr Val Ala Gly Asn Val Asn Asn Ala Ser Ile Ser Gly
725 730 735

Leu Cys Lys Ala Asn Thr Lys Asp Asn Ser Gly Lys Ser Asp Glu Asp
740 745 750

Ala Arg Lys Glu Leu Cys Glu Lys Leu Val Lys Glu Val Glu Glu Gln
755 760 765

Cys Lys Ala Leu Pro Thr Glu Leu Gly Gln Pro Ala Ala Asp Leu Lys
770 775 780

Lys Asp Tyr Lys Thr Tyr Glu Glu Leu Lys Lys Arg Ala Glu Glu Ala
785 790 795 800 

Met Asn Lys Ser Ser Leu Val Leu Ser Leu Ile Lys Lys Asn Glu Ser
805 810 815 

Asn Val Ser Lys Ser Asn Ser Lys Asn Lys Asp Lys Asn Ala Val Ser 
820 825 830 

Asn Gly Leu Gln Asp Thr Thr Lys His Val Lys Ile Leu Arg Arg Gly
835 840 845 

Val Lys Asp Val Ser Val Thr Glu Leu Glu Ala Lys Ala Phe Asp Leu 
850 855 860 

Ala Ala Glu Val Phe Gly Arg Tyr Val Asp Leu Lys Glu Arg Cys Asn 
865 870 875 880 

Lys Leu Glu Ser Asp Cys Arg Ile Lys Glu Asp Cys Lys Asp Leu Glu
885 890 895 

Glu Val Cys Lys Lys Ile Asn Lys Ala Cys Arg Asn Leu Lys Pro Leu
900 905 910 

Glu Val Lys Pro His Glu Thr Val Thr Glu Ser Thr Thr Thr Thr Thr 
915 920 925 

Thr Thr Thr Thr Thr Val Ala Asp Pro Lys Ala Thr Glu Cys Lys Ser 
930 935 940 

Leu Gln Thr Thr Asp Thr Trp Val Thr Gln Thr Ser Thr His Thr Ser
945 950 955 960 

Thr Ser Thr Ile Thr Ser Thr Ile Thr Ser Lys Ile Thr Leu Thr Ser
965 970 975 

Thr Arg Arg Cys Lys Pro Thr Lys Cys Thr Thr Gly Glu Glu Asp Asp 
980 985 990 

Ala Gly Asp Val Lys Pro Ser Glu Gly Leu Arg Met Ser Gly Trp Asn 
995 1000 1005 

Val Met Arg Gly Val Ile Val Ala Met Val Ile Ser Phe Met Ile
1010 1015 1020 



<210> 15 
<211> 249 
<212> DNA 

<213> Pneumocystis carinii sp. f. hominis

<220> 

<221> CDS 

<222> (1)..(249)

<400> 15

gag tgc caa tct ctg cag acg aca gac acg tgg gtc aca aag acg tcg 48
Glu Cys Gln Ser Leu Gln Thr Thr Asp Thr Trp Val Thr Lys Thr Ser
1 5 10 15

acc cat act agc act tct acg act acg tcc aca gtc aca tcg aga ata 96
Thr His Thr Ser Thr Ser Thr Thr Thr Ser Thr Val Thr Ser Arg Ile
20 25 30

aca ctc acc tca acg agg cgg tgt aag cct acg aag tgt acg aca gga 144
Thr Leu Thr Ser Thr Arg Arg Cys Lys Pro Thr Lys Cys Thr Thr Gly
35 40 45

gag gaa gat gat gca gga gag gtg aag ccg agt gaa ggg ctg agg atg 192
Glu Glu Asp Asp Ala Gly Glu Val Lys Pro Ser Glu Gly Leu Arg Met
50 55 60

agt ggg tgg agt gtg atg agg ggg gtg tta tta gca atg atg att tca 240
Ser Gly Trp Ser Val Met Arg Gly Val Leu Leu Ala Met Met Ile Ser
65 70 75 80

ttc atg att 249
Phe Met Ile



<210> 16 
<211> 83 
<212> PRT 

<213> Pneumocystis carinii sp. f. hominis 
<400> 16 

Glu Cys Gln Ser Leu Gln Thr Thr Asp Thr Trp Val Thr Lys Thr Ser
1 5 10 15

Thr His Thr Ser Thr Ser Thr Thr Thr Ser Thr Val Thr Ser Arg Ile
20 25 30

Thr Leu Thr Ser Thr Arg Arg Cys Lys Pro Thr Lys Cys Thr Thr Gly
35 40 45

Glu Glu Asp Asp Ala Gly Glu Val Lys Pro Ser Glu Gly Leu Arg Met
50 55 60

Ser Gly Trp Ser Val Met Arg Gly Val Leu Leu Ala Met Met Ile Ser
65 70 75 80

Phe Met Ile



<210> 17 
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
oligonucleotide 

<400> 17 

gaatgcaaat ccttacagac aacag 25

<210> 18
<211> 25 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
oligonucleotide 

<400> 18 

gaatgcaaat ctttacagac aacag 25



<210> 19 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
oligonucleotide 

<400> 19 

tgcaaaccaa ccaagtgtac gacagg 26



<210> 20 
<211> 25
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic
oligonucleotide



<400> 20 

aaatcatgaa cgaaataacc attgc 25



<210> 21 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
oligonucleotide 

<400> 21 

tttcatatgg cgcgggcggt caagcggcag 30



<210> 22 
<211> 30 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
oligonucleotide

<400> 22

ctaaatcatg aacgaaataa ccattgctac 30



<210> 23 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
oligonucleotide 

<400> 23

gaattcgatc tgaagcctct ggag 24

<210> 24 
<211> 24 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
oligonucleotide 

<400> 24

ttctagaaac ccactcatct tcaa 24

<210> 25 
<211> 22 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
peptide 

<400> 25

[illegible] Met Tyr Gly Leu Phe Tyr Gly Ser Gly Lys Glu Trp Phe Lys Lys
1 5 10 15

Leu Leu Glu Lys Ile Met
20 



<210> 26 
<211> 15 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: synthetic 
peptide 

<400> 26

Thr Ile Thr Ser Thr Ile Thr Ser Lys Ile Thr Leu Thr Ser Thr
1 5 10 15